Abstract


This thesis defines a property called "view-serializability,” which formalizes internal consistency 
for a system of nested atomic transactions. Internal consistency is a stronger condition than the usual 
notion of database consistency, because it takes into account the views of transactions which will never 
commit. In a distributed system, local aborts of remote subactions and crashes of nodes can generate
orphans: active actions which are descendants of actions that have aborted or are guaranteed to abort.
Because it is not always feasible or efficient to eliminate orphans immediately, special care is needed to
insure that they see consistent system states when they are allowed to continue running. We investigate a 
particular dynamic detection strategy designed to detect orphans before they violate internal consistency. 
This algorithm piggybacks abort and crash information on the normal messages between nodes. We 
consider a simpler algorithm that only handles orphans arising from explicit aborts. We describe the 
simplified orphan detection algorithm at various levels of abstraction, using an algebraic model 
convenient for describing asynchronous systems. The highest-level model is specified in terms of a 
(virtual) global state. At this level of abstraction we require that the states generated by the model satisfy 
view-serializability. Lower-level models progressively localize the description of the algorithm’s 
operation, and the lowest level of abstraction presents a fully distributed model of the (simplified) orphan 
detection scheme. 
1. Introduction


Production of concurrent programs is a much more difficult task than production of sequential 
programs. The sequential nature of human thought severely limits programmers’ ability to manage the 
complexity of parallel processes. Distributed environments compound these difficulties; robust programs 
must cope with non-local failures and with incomplete information about the global state of a system. 
Primitives developed for local, sequential programming have proven inadequate for software 
development in distributed, concurrent systems. Additional mechanisms have been suggested which 
allow programmers to think about concurrent programs for distributed systems using largely sequential
reasoning.


Current research [Liskov82, Best81] stresses use of the atomic transaction as a tool for
distributed software. Atomic transactions can insulate users from both the effects of concurrency and the
effects of failures, greatly simplifying reasoning about a system. If transactions are truly atomic, then
neither users nor the transactions themselves should see the effects of concurrency or failures. Our
concern is with the internal consistency property of transactions’ views.


Recent proposals have extended the transaction model to include nested transactions, which 
allow sub-pieces of a transaction to run concurrently and fail independently [Reed78, Moss81]. In such a 
system the independent failure of (sub)transactions can generate orphan processes -- active processes 
which are running on behalf of a failed transaction. (We will refine and extend this definition below.) 
Orphans complicate the implementation of atomicity; insuring that orphans’ views of the system state are
“consistent” with atomicity requires a more sophisticated algorithm than one which ignores orphans’
views.


This thesis develops a formal model of a distributed nested transaction system, and it shows that
the model satisfies a correctness condition representing “consistency of views.” Our transaction system
model includes a dynamic orphan detection scheme, which detects and exterminates orphans before they
see inconsistencies. This model is based on the design for the Transaction Manager of the Argus language
under development by the M.I.T. Distributed Systems Group [Liskov82]. Although the models in this
thesis simplify both the assumptions made by Argus about the distributed environment, and the specifics 
of the Argus orphan detection algorithm, the results contribute to confidence in the correctness of this 
algorithm. 


1.1 Nested Transactions 
1.1.1 Transactions and Atomicity 


An atomic transaction is a computation that appears to occur instantaneously and indivisibly 
from the point of view of any observer of its effects (except for an observer “inside” the transaction). 
("Observer" here might refer to another transaction, or to a user of transactions.) If all operations on a 
system take place through atomic transactions, then each transaction will have the illusion that it is run in 
isolation: the effects of concurrency are not visible to any transaction. This synchronization property is 
often referred to as serializability: for any observer (including the transactions themselves), the system 
state seems to be the result of a serial execution of transactions. An execution of transactions can be
serializable without being serial (as a trivial example, if no two transactions access the same data objects,
then any execution is serializable).


Another property of atomic transactions is failure atomicity: each transaction appears to have
run completely or not at all. An atomic transaction cannot “partially complete.” A transaction which
runs to completion is said to “commit;” a transaction which fails (and has no effect) is said to “abort.”
Failure atomicity simplifies specification of the possible effects of a transaction, since only “good”
executions must be considered.


Atomic transactions simplify reasoning about a system because the effects of concurrency and 
failures can be ignored. Atomicity implies that if an integrity constraint (an invariant) on the system state
is preserved by all transactions when run in isolation and to completion, then this invariant is preserved
by any (possibly concurrent) execution of these transactions. Local, sequential reasoning can be directly
applied to a distributed, concurrent environment.


1.1.2 Nested Transactions 


Nested transactions extend the usual single-level structure of transactions to a hierarchical
structure. A nested transaction can contain other nested (sub)transactions, each of which is atomic with
respect to the others. Nesting can be arbitrarily deep. Usual terminology for hierarchical relationships
applies to nested transactions. (Thus we refer to the “parent transaction” of a given transaction, or to its
“children,” etc. “Ancestor” and “descendant” are considered reflexive; “proper ancestor” and “proper
descendant” are the corresponding irreflexive relations.)


The child transactions of any transaction can run concurrently; their concurrent execution must
be serializable. Children can also commit or abort independently; a child commits to its parent, and its
effects will be undone if the parent subsequently aborts. It follows that permanent changes to the system
state occur only when top-level transactions commit. (For details of the semantics of nested transactions,
see [Moss81].)


Nested transactions provide at least three advantages over single-level transactions. First, the ability
to create concurrent children at any level increases the overall parallelism in a system, which might result
in efficiency gains. Second, the independent abort of a child confines the effects of failure to the work
done by that child; the parent can take an appropriate action without aborting itself. This failure isolation
improves program robustness and simplifies error recovery. Finally, a program (or a transaction at any
level) can use (sub)transactions without regard to their internal concurrency. Concurrency need not be
completely specified at the top level, permitting a decentralized design strategy.


1.1.3 Distributed Environment 


Two differences between distributed and centralized systems make nested transactions 
particularly appropriate for distributed environments. First, because distributed systems provide real 
concurrency, a systematic method for managing parallelism becomes both necessary and desirable (for 
efficiency). Second, the failure modes of distributed systems are much more complex than failure modes 
of centralized systems because parts of the system can fail independently. For example, one node in a 
network can go down without affecting other nodes, or the network can fail without directly affecting any 
node. The nested transaction model allows applications to isolate these failures naturally. (Failure 
isolation also contributes to node autonomy: an application running at one node maintains control over
the state at that node even if it spawns subtransactions at other nodes.)


1. For modifications to remain permanent when nodes crash, each node must provide stable storage, and top-level commit must 
insure that all changes are written to stable storage. 
1.2 The Argus System


Although we have attempted to make the models in this thesis relatively general, the Argus 
system has been used as a starting point. We summarize here the characteristics of Argus which are 
relevant to this work. Argus is a programming language intended to support distributed applications; this 
language requires an extensive runtime system (for example, to handle transaction management). For
details on the language, see [Liskov82].


The distributed environment of Argus consists of a set of nodes fully connected in some fashion 
by a network. Nodes can crash at any time, and recover after an arbitrary down period. Storage at a node 
is divided into volatile and stable storage; the contents of volatile storage are lost when the node crashes,
while the contents of stable storage survive crashes.


Nodes communicate by sending messages on the network. Delivery of messages is not 
guaranteed: messages can be lost, duplicated, delayed arbitrarily, and reordered (i.e., delivered in an 
order other than the order in which they were sent). The network can be partitioned for any period of 
time. If one node attempts to send a message to another node, it might be unable to distinguish between
a lost message, a partitioned network, a crashed respondent, or a respondent that is slow to answer.


Data in the system is partitioned into objects; objects are atomic or non-atomic. We assume that 
all objects are atomic. (Unconstrained use of non-atomic objects is discouraged in Argus; non-atomic 
objects are provided as loopholes to allow users to implement atomic types which are more efficient than 
the "basic" atomic types provided by the system.) While a precise definition of atomic objects is beyond 
the scope of this thesis (see [Weihl82] for a discussion), we assume that all atomic objects are implemented
using two-phase locks with a stack of versions as described in [Moss81]. When an action holds a lock on
an atomic object, other unrelated actions are excluded from accessing the object.² (Chapter 8 defines a
structure which models the lock and version stack of an atomic object.)


2. Moss distinguishes between read and write operations; we will ignore this distinction for simplicity.


Computation is carried out through actions, which are atomic transactions. A (sub)action runs
completely at one node, though it can spawn child actions at other nodes. Remote subactions are created
by a remote procedure call, which sends a message from the originating node to the remote node. This
message can contain parameters computed at the parent node. If the message is received correctly, the
subaction runs and can return a message to the parent. The child can commit to its parent, in which case
results can be passed back to the parent with a commit message, or it might abort. The parent can abort
the child at any time, but this abort is local to the parent’s node; the child might still be running at its own
node. The parent cannot “commit” the child: the child is committed at the parent’s node only if a
commit message is received from the child. We say an action commits to one of its ancestors if all actions
“between” that action and its ancestor commit. We say an action commits through the top level if all
ancestors of that action commit.
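The two commit relations just defined can be illustrated with a small sketch. The `Action` class, its `status` field, and both function names are hypothetical and are not taken from Argus. The sketch also assumes that "between" includes the action itself and excludes the ancestor, and that "all ancestors" is read reflexively, so that the action and its top-level ancestor must both have committed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Action:
    name: str
    parent: Optional["Action"] = None   # None marks a top-level action
    status: str = "active"              # "active", "committed", or "aborted"


def commits_to(action: Action, ancestor: Action) -> bool:
    """True if `action` and every action below `ancestor` on the path
    from `action` up to `ancestor` has committed (assumed reading of "between")."""
    node = action
    while node is not None and node is not ancestor:
        if node.status != "committed":
            return False
        node = node.parent
    return node is ancestor   # False when `ancestor` is not on the path


def commits_through_top(action: Action) -> bool:
    """True if `action` and all of its ancestors, including the top level, have committed."""
    node = action
    while node is not None:
        if node.status != "committed":
            return False
        node = node.parent
    return True


# A1 has committed to its parent A, but A and the top-level action T are still active.
top = Action("T")
a = Action("A", parent=top)
a1 = Action("A1", parent=a, status="committed")
```

With this setup `commits_to(a1, a)` holds while `commits_through_top(a1)` does not, which is the situation in which the effects of A1 can still be undone by an abort higher in the tree.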


Effects of actions are written to stable storage when their top-level ancestor action commits. A 
two-phase commit protocol insures that the top-level action commits everywhere or not at all (again, 
consult [Liskov82] for details). If a node crashes after an action runs there, and that action has committed 
to its ancestor top-level action, then the crash will be detected during two-phase commit. Thus the 
top-level action will be aborted. It follows that a crash which undoes the effects of an action (i.e. a crash 
which precedes the recording of that action’s effects on stable storage) guarantees that some ancestor of 
that action will abort. (This ancestor might not be the top-level ancestor: a lower ancestor might abort,
and then the crashed node would not necessarily be checked at two-phase commit.)


1.3 Orphans 


An orphan is an active action that is guaranteed not to commit through the top level. In Argus,
orphans can be created in two ways: a proper ancestor can explicitly abort, or a crash can occur.


1.3.1 Creation of Orphans 


Argus allows parent actions to unilaterally abort their children, because user requirements 
might make it unacceptable to wait for confirmation of the abort from the child’s node. Complete 
confirmation would require that each aborted child recursively abort its active children; thus the parent 
would have to wait until all descendants of the child were halted. Since one of the main reasons for 
aborting the child might be that the child is not responding (perhaps because of a network partition, or 
because the child’s node crashed), waiting for descendants to be halted could delay the parent 
indefinitely. Some applications cannot tolerate the possibility of indefinite delay. 


Since a parent action can abort a child at the parent’s node only, aborted children (and their
descendants) might still be active, and might thus be orphans. These orphans are a necessary
consequence of a user requirement for bounded delay; they are not the result of a “lazy extermination”
strategy.


Orphans result from a node crash when an active action at that node has active descendants at 
other nodes. This situation is similar to the case of explicit aborts since the active ancestor is effectively 
“aborted” by the crash. A more complex type of orphan generation occurs when a crash releases a lock 
held by an action which has committed up to some ancestor, but not through the top level. The lowest 
active ancestor, and all its active descendants, become orphans since they are guaranteed not to commit 
through the top level. Since this lowest active ancestor might abort -- or be aborted by its parent -- the 
crash need not affect higher ancestors. If the lowest active ancestor commits to its parent, the parent and 
all active descendants of the parent become orphans. If the "infected" ancestor commits to its top-level 
ancestor, then the crash will be detected during two-phase commit, and the top-level ancestor will abort.
This type of orphan could be prevented by keeping locks and versions in stable storage.


1.3.2 Problems Created by Orphans 


Orphans are unpleasant, though necessary, side-effects of aborts and crashes. Since their effects 
are destined to be undone, exterminating orphans cannot do harm. The main concern of this thesis is
with the possible adverse consequences of not exterminating orphans “soon enough.”


1.3.2.1 Resource Wastage 


Orphans consume resources and compete with non-orphans for these resources. Orphans can 
deadlock with non-orphans, causing non-orphans to be aborted unnecessarily (depending on the 
deadlock strategy). Resource allocation problems are unlikely to be severe unless orphans are created 
very frequently. While efficiency issues might be crucial for a working system, this thesis only addresses
the semantic problems associated with orphans.
1.3.2.2 Internal Consistency


The transaction management algorithm described in [Moss81] does not guarantee atomicity 
from the point of view of orphans. Orphans can observe system states which are not consistent with 
serializability (i.e., they can observe the effects of concurrency). Moss’s algorithm does not preserve 
internal consistency. The orphan detection algorithm described in the next section is designed to
guarantee internal consistency.
We present two examples of such inconsistencies: 


1. (See Fig. 1.1. Note that conventions for figures appear in Appendix I.) Initially integers x
and y (at different nodes) have values 0. There is an integrity constraint on the system state
that x = y. Action A1 runs, reads x = 0, (does not modify x), and commits to A. A then
holds a lock on x. (See [Moss81] for a detailed description of the locking protocol.) A then
spawns action A2 (passing A2 the information that x = 0), and then A aborts (after the
message is sent to create A2), making A2 an orphan. The abort of A releases A’s lock on x,
allowing B to run to completion and increment both x and y through concurrent children B1
and B2. B commits, releasing its locks on x and y. If A2 (now an orphan) is allowed to read
y, it will view y = 1, which allows A2 to infer that x ≠ y (an “inconsistent” view, since x = y
will always hold for any serial schedule).


Fig. 1.1. Orphan from Explicit Abort


2. (See Fig. 1.2.) As in the above scenario, integers x and y (at different nodes) have initial
values 0, and there is an integrity constraint on the system state that x = y. The same events
occur as above: A1 reads x and commits, and A creates A2. Instead of an abort at A,
however, x’s node crashes. This crash releases A’s lock on x, and it makes A (and thus A2) an
orphan. As above, B then runs to completion and increments both x and y. B commits,
releasing its locks on x and y. If A2 (now an orphan) is allowed to read y, it will view y = 1,
which allows A2 to infer that x ≠ y.
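The first scenario can be replayed as a short sequential sketch. It is deliberately crude: locks, nodes, and messages are not modeled, and the function name `replay_explicit_abort` and the dictionary `store` are hypothetical. It only shows that the committed state satisfies x = y at every point while the orphan's combined view does not.

```python
def replay_explicit_abort():
    """Replay example 1 with no orphan detection and return what A2 observes."""
    store = {"x": 0, "y": 0}           # invariant on the committed state: x == y

    # A1 reads x on behalf of A and commits to A (x is not modified).
    x_read_by_a1 = store["x"]

    # A passes x = 0 to A2 in the create message, then aborts: A2 is now an orphan.
    x_known_to_a2 = x_read_by_a1

    # A's abort releases the lock on x, so B runs to completion and commits.
    store["x"] += 1                    # child B1
    store["y"] += 1                    # child B2

    # The orphan A2 is allowed to read y.
    y_read_by_a2 = store["y"]

    return {
        "orphan_view": (x_known_to_a2, y_read_by_a2),
        "store_satisfies_invariant": store["x"] == store["y"],
        "view_satisfies_invariant": x_known_to_a2 == y_read_by_a2,
    }


result = replay_explicit_abort()
```

Here `result["orphan_view"]` is `(0, 1)`: no serial schedule of A and B produces that pair, even though the stored values of x and y agree throughout.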


It is not clear whether internal consistency is an important concern for a transaction system. 
One might argue that orphans’ views are not important, since orphans will be aborted (eventually) 
anyway. Since all actions expect a serializable system history, however, programs might function 
“correctly” only when their views are consistent. Their behavior when views are not consistent might be 
unpredictable or even catastrophic. (For example, a program guaranteed to terminate under normal
conditions might be non-terminating when faced with an inconsistent view.) Orphans could also transmit
their inconsistent views to outside parties, via channels which are not under the control of the transaction
system. For example, when a user interactively debugs a process that is an orphan, he sees the orphan’s
(possibly inconsistent) view. This inconsistency might mislead the user, since he might have no direct
way of determining that his process is an orphan. A system which permits terminal output by any action
suffers the same problem. (Since terminal output is irreversible, the effects of any aborted action cannot
be undone. The orphan’s output represents a worse problem, however, since this output might reflect an
inconsistent state.)


Fig. 1.2. Orphan from Crash


1.3.3 Orphan Detection Scheme 


The basic orphan detection strategy in Argus piggybacks abort and crash information on all 
channels of information flow between actions. This additional information is used to infer that processes
are orphans; these processes are then exterminated.


Our execution model ignores crashes; we deal only with orphans arising from explicit aborts.
(We believe that the correctness condition for internal consistency that we develop in Chapter 3 should
also apply to a model which includes crashes, although we have not investigated crashes in detail.) We
present here a brief description of an orphan detection scheme similar to the portion of the Argus
algorithm which handles explicit aborts. The transaction system model we develop is based on this
scheme. Our simplified algorithm ignores many of the optimizations envisioned for the actual Argus
algorithm.


User programs at nodes communicate via remote procedure calls and returns. In addition to 
these messages, transaction system messages are sent between nodes to update the status of actions as they 
commit and abort. Commit and abort messages update the locks and versions of atomic objects. There 
are many possible strategies for communicating commit and abort information. For example, when an
action commits or aborts, a commit or abort message could be sent immediately to all nodes where
descendants of that action have run. Alternatively a querying strategy could be used where queries are sent about the
status of an action only when another action wants a lock held by that action. (The commit and abort
messages would then be possible responses to a query.) Our model will not focus on these strategies; we
focus on the orphan information which is attached to messages whenever these messages are sent. We
regard the return message from a remote procedure call as a commit or abort message, depending on
whether the child committed or aborted. The return message might include return values, but since our
concern is only with orphan information we need not distinguish between return messages and
transaction system messages.


Our model has three types of messages: create, commit, and abort messages. A create message
models a remote procedure call. Although in Argus a “create” message will only be sent directly from a
parent node to a child node, for simplicity we assume that a create message can be sent indirectly through
any other node. Communication in our model is very unrestricted; essentially any node can send a
message to any other node at any time. The messages that a node can send are limited by what is known
at that node (e.g., a node can only send a “commit A” message if it knows that A is committed), and by
rules for piggybacking orphan information on messages.


The orphan information at each node is a list, DONE, of known aborts. Any action which is a 
descendant of an action in DONE is an orphan and is exterminated. Our rules for piggybacking orphan 
information are quite simple: a create or commit message must include the entire DONE list from its 
sending node; this list is added to the DONE list at the receiving node when the message is received, and
known orphans are exterminated. An abort message need not include any information from DONE.


The information flow in this algorithm for the example given in Fig. 1.1 is shown in Fig. 1.3.
When A aborts, the abort message releasing A1’s lock on x adds A to x’s node’s DONE list. This DONE
list is transmitted to B’s node when B1 commits. After B2 runs and commits to B, and B commits, y’s
node will eventually learn of B’s commit. The message that B has committed will contain the DONE
from B’s node (which now includes A). Thus y’s node will know about A’s abort. The commit message
of B releases B’s lock on y, but A2 is now a known orphan at y: A2 is exterminated before it can acquire
the lock on y and see an inconsistent state.
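The piggybacking rules and the information flow described above can be sketched as follows. This is an illustration under simplifying assumptions, not the Argus implementation: the `Node` class and its methods are hypothetical, locks and versions are omitted, each active action is stored with its (reflexive) list of ancestors, and messages are plain tuples delivered by direct calls.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.done = set()    # DONE: actions known at this node to have aborted
        self.active = {}     # active action -> its ancestors, itself included

    def exterminate_orphans(self):
        """Kill every active action that has an ancestor in DONE."""
        killed = [a for a, ancestors in self.active.items()
                  if any(anc in self.done for anc in ancestors)]
        for a in killed:
            del self.active[a]
        return killed

    def record_abort(self, action):
        """An abort decided locally at this node."""
        self.done.add(action)
        return self.exterminate_orphans()

    def send(self, kind, action):
        """Create and commit messages carry the whole DONE list; abort messages need not."""
        piggyback = frozenset(self.done) if kind in ("create", "commit") else frozenset()
        return (kind, action, piggyback)

    def receive(self, message):
        kind, action, piggyback = message
        self.done |= piggyback          # merge the sender's DONE list
        if kind == "abort":
            self.done.add(action)       # an abort message names the aborted action
        return self.exterminate_orphans()


# Replay of the flow for Fig. 1.1: A at M, B at N, x at X, y at Y.
M, N, X, Y = Node("M"), Node("N"), Node("X"), Node("Y")
Y.active["A2"] = ("A2", "A")                    # A2 is running at Y on behalf of A

M.record_abort("A")                             # A aborts at its own node
X.receive(M.send("abort", "A"))                 # releases the lock on x at X
N.receive(X.send("commit", "B1"))               # B1's commit carries X's DONE to N
killed_at_Y = Y.receive(N.send("commit", "B"))  # B's commit carries DONE to Y
```

After the last step `killed_at_Y` is `["A2"]`: Y learns of A's abort in the same message that would release B's lock on y, so A2 is removed before it can read y.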


The flow of crash information is similar to the flow of DONE information. (We describe the
mechanism only superficially here; the actual algorithm is quite complex.) The basic scheme requires
each node to maintain a stable crash count, which is incremented during recovery from any crash. The
orphan information relating to crashes consists of currently known crash counts for nodes plus the crash
counts seen by actions when they ran at these nodes. An orphan is detected when it is discovered that a
crash count “depended on” by an action (essentially a crash count for a node at which a committed
relative has run) is lower than the currently known crash count for the same node. The discrepancy in
crash counts implies that a node crash must have occurred since a committed relative ran at that node.
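The crash count comparison can be sketched in a few lines. Since the text describes the mechanism only superficially, everything beyond the comparison itself is an assumption: the function names are hypothetical, crash counts are kept in dictionaries keyed by node name, and known counts received on a message are merged by taking the pointwise maximum.

```python
def merge_known_counts(local, incoming):
    """Combine two maps of known crash counts, keeping the larger count per node
    (the merge rule is an assumption of this sketch)."""
    return {node: max(local.get(node, 0), incoming.get(node, 0))
            for node in local.keys() | incoming.keys()}


def is_crash_orphan(depended_on, known):
    """True if some crash count the action depends on is lower than the
    currently known crash count for the same node."""
    return any(known.get(node, seen) > seen
               for node, seen in depended_on.items())


# A committed relative of the action ran at node X when X's crash count was 3.
depended_on = {"X": 3}
known_at_Y = {"X": 3}
before = is_crash_orphan(depended_on, known_at_Y)

# A later message reports that X has since recovered from a crash.
known_at_Y = merge_known_counts(known_at_Y, {"X": 4})
after = is_crash_orphan(depended_on, known_at_Y)
```

In this run `before` is `False` and `after` is `True`: the action becomes a detectable orphan as soon as Y learns of the higher crash count for X.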
Fig. 1.3. Orphan Detection


A runs at node M, B at node N
A1, B1 run at node X (object x resides at X)
A2, B2 run at node Y (object y resides at Y)


(1) A1 runs and commits to A; A spawns A2 (A2 has not read y).
    B spawns B1; B1 waits because A holds a lock on x.

    Node   DONE   LOCKS/VERSIONS
    M      {}
    N      {}
    X      {}     x=0, held by A
    Y      {}     y=0


(2) A aborts; abort message sent to X, releasing lock to B1.
    B1 increments x and commits; commit message sent to N (with DONE).

    Node   DONE   LOCKS/VERSIONS
    M      {A}
    N      {A}
    X      {A}    x=0, held by B
    Y      {}     y=0


(3) B2 runs and increments y and commits. B commits, sending commit
    message (with DONE) to X and Y. Commit of B arrives at X and Y,
    releasing B’s locks.

    Node   DONE   LOCKS/VERSIONS
    M      {A}
    N      {A}
    X      {A}    x=1
    Y      {A}    y=1


(4) A2 is aborted because it is a known orphan at Y.


1.4 Related Work
1.4.1 Transaction System 


Our transaction system model is based on the design presented in [Moss81]. Moss generalizes 
two-phase locking for nested transactions, and he develops a recovery scheme based on multiple (backup) 
versions of objects. His transaction manager functions in the presence of both node crashes and 
communications failures. He describes distributed algorithms for locking and version restoration, 
transaction management (including two-phase commit for top-level transactions), and deadlock detection. 
Although our formal model ignores many of the complexities that Moss considers (in particular, node
crashes), it relies heavily on his basic framework.


A different approach to nested transactions is explored by Reed in [Reed78]. This scheme uses 
timestamps ("pseudo-times”) for synchronization rather than using locks. Versions of objects associated 
with old timestamps can be used for backing up a system to a consistent state. It would be interesting to 
attempt to extend our models to a timestamp-based scheme such as Reed’s. While our lower-level 
execution models incorporate notions of locks and version stacks, the higher-level models are relatively
general, relying only on a nesting relationship among actions and on a notion of “accessing” data.


1.4.2 Orphan Detection Algorithms 


As mentioned above, the orphan detection algorithm we consider is based on the orphan 
algorithm designed for Argus [Liskov82]. Though we are aware of no implementations of orphan 
detection algorithms, Nelson explores several strategies for eliminating the orphans which result from
node crashes [Nelson81]. (Because his design is not based on atomic transactions, orphans from broken 
locks or explicitly aborted ancestors do not arise: his orphans are simply processes running on behalf of 
ancestors at crashed nodes.) The simplest such strategy is orphan extermination: After a node comes 
back up after a crash, it exterminates all orphans by tracing all outstanding remote calls. As we discussed 
above, an “immediate extermination” strategy would not be practical for Argus because of user
requirements for a bounded delay.


Because communications or node failures can delay extermination during crash recovery
indefinitely, Nelson suggests alternate mechanisms which can be used in these (probably rare) cases.
Orphan expiration requires that a remote call inherit a time limit from its parent; when the time limit is
reached the process running the call is automatically killed. Expiration can cause needless failures since
processes can be killed even if they are not orphans. The chosen time limit should be significantly longer
than “normal” execution times to prevent these anomalies.


Finally, Nelson suggests a scheme which resembles the crash count mechanism in Argus: When
complete extermination during crash recovery is delayed, a node will declare a new epoch (i.e. increment
an “epoch” counter). All messages carry the current known epoch from the sending node. If a node
receives a message with an epoch greater than its known epoch, it must either exterminate all currently
executing remote calls (assuming that they are orphans), or query the ancestors of remote calls to
guarantee that they are not orphans. The system reaches equilibrium when all nodes have the same
epoch. This approach is most similar to the Argus algorithm because potential orphans are detected
dynamically based on information piggybacked onto normal information paths.
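A minimal sketch of the epoch rule, based only on the description above and not on Nelson's actual design: the `EpochNode` class and its method names are hypothetical, and of the two permitted reactions to a newer epoch the sketch takes the conservative one (exterminate every executing remote call) rather than querying ancestors.

```python
class EpochNode:
    def __init__(self):
        self.epoch = 0             # highest epoch known at this node
        self.remote_calls = set()  # remote calls currently executing here

    def declare_new_epoch(self):
        """Called when extermination during crash recovery is delayed."""
        self.epoch += 1

    def stamp(self, payload):
        """Every outgoing message carries the sender's known epoch."""
        return (self.epoch, payload)

    def receive(self, message):
        epoch, payload = message
        killed = set()
        if epoch > self.epoch:
            # Conservative choice: assume all executing remote calls are orphans.
            killed, self.remote_calls = self.remote_calls, set()
            self.epoch = epoch
        return payload, killed


recovering, worker = EpochNode(), EpochNode()
worker.remote_calls = {"call-1", "call-2"}

recovering.declare_new_epoch()
payload, killed = worker.receive(recovering.stamp("request"))
```

After this exchange both nodes are at the same epoch and `killed` holds the two calls that were running at `worker`; a further message stamped with the same epoch causes no extermination.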


1.4.3 Formal Models of Atomic Actions 


This thesis is a direct extension of the work described in [Lynch82]. Lynch gives the basic 
definitions for action trees and serializability that we use here. She presents an execution model (at 
several levels of abstraction) based on Moss’s transaction management algorithm, and she shows that 
these executions satisfy external consistency. Our work extends the correctness condition for executions
to include internal consistency, and it modifies the execution models to incorporate orphan detection.


Traditional concurrency control theory generally deals only with single-level transactions. The 
usual approach is to define a dependency relation among transactions based on reads and updates, and to 
show that acyclicity of this relation implies serializability (see [Papa79], for example). The basic theory of 
two-phase locking and serializability for single-level transactions is developed in [EGLT76]; this work
forms a basis for Moss’s system and hence for our models.
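The acyclicity argument for single-level transactions can be illustrated with the standard conflict-graph test. The sketch is generic and is not the formal model of this thesis: the schedule encoding as `(transaction, operation, object)` triples and the function name `equivalent_serial_order` are assumptions, and two operations are taken to conflict when they touch the same object and at least one is an update. The topological sort uses Python's standard `graphlib` module.

```python
from graphlib import CycleError, TopologicalSorter


def equivalent_serial_order(schedule):
    """Return a serial order of transactions consistent with all dependencies,
    or None if the dependency relation contains a cycle.

    `schedule` is a list of (transaction, op, obj) with op "r" (read) or "w" (update).
    """
    predecessors = {}
    for i, (t1, op1, obj1) in enumerate(schedule):
        predecessors.setdefault(t1, set())
        for t2, op2, obj2 in schedule[i + 1:]:
            # t2 depends on t1 if a later operation of t2 conflicts with this one.
            if t1 != t2 and obj1 == obj2 and "w" in (op1, op2):
                predecessors.setdefault(t2, set()).add(t1)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


# T1 reads x before T2 updates it, but reads y after T2 updates it: a cycle.
interleaved = [("T1", "r", "x"), ("T2", "w", "x"), ("T2", "w", "y"), ("T1", "r", "y")]

# The same operations with T1 running entirely before T2: no cycle.
serial_like = [("T1", "r", "x"), ("T1", "r", "y"), ("T2", "w", "x"), ("T2", "w", "y")]

cyclic_result = equivalent_serial_order(interleaved)
acyclic_result = equivalent_serial_order(serial_like)
```

The interleaved schedule yields `None`, and it corresponds to the view of x and y that an undetected orphan could obtain in the examples of Section 1.3.2.2. The second schedule yields the order T1, T2.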


A formal model for nested atomic actions is developed in [Best81]. This model is based on a 
dependency graph for events, where the notion of “dependency” is left uninterpreted. Atomicity is 
defined in terms of “collapsing” an event graph to replace a set of events (the events from an "atomic" 
action) with a single (higher-level) event. Sets of events are configured in a tree structure, representing 
the nesting relationship of actions. Acyclicity of inter-action dependencies is shown to be sufficient for 
atomicity. (Lynch uses a data dependency relation to derive a similar acyclicity condition for 
serializability.) The authors also define a condition which they claim is a generalization of two-phase
locking, and they show that this condition implies atomicity.


The main difficulty with this dependency graph model is that the graphs cannot be easily related 
to executions of a transaction system. The action trees developed by Lynch are simply summaries of 
execution histories; “dependencies” are absent at this level of abstraction. (Although Lynch defines 
lower Ievel “augmented” action trees which include an ordering on accesses to data, the “dependencies” 
expressed by this ordering reflect actual modifications to data in an execution sequence.) The advantage 
of this approach is that Lynch is able to define execution models formalizing a transaction management 
algorithm, and to prove that her high-level serializability condition is satisfied by these models. This 
connection between execution models and correctness conditions (for “atomicity") is not explored in 
[Best81]. We have followed Lynch’s approach: we define a condition modeling internal consistency at a 
high level (the level of action trees), and we develop (at several levels of abstraction) a model of an 
orphan detection strategy which guarantees this property. 


1.5 Outline of the Thesis 


Before attempting to show that our orphan detection strategy is correct, we must develop a 
considerable amount of formal machinery. Chapter 2 presents the basic action tree model as described in 
[Lynch82]. (Some parts of this chapter are taken directly from [Lynch82]; though these definitions and 
theorems are not original work of this thesis, we include them here for completeness of presentation.) 
Serializability is defined for action trees, and a theorem is given relating serializability to acyclicity of data 
dependencies. 


Chapter 3 defines "view-serializability," which models internal consistency. We present a 
detailed argument explaining why this formal condition corresponds to our intuitive notion of “consistent 
views.” The condition is defined in terms of the action trees and serializability definitions of Chapter 2. 


Chapter 4 develops a general execution model for asynchronous systems, the “event-state 
algebra.” We explore a strategy for hierarchical correctness proofs: A correctness condition for 
executions of a system is defined using a high-level model of its behavior (an algebra); lower-level models 
are then defined which are progressively closer to the "real" system, and mappings are described between 
adjacent levels. We also describe distributed event-state algebras, which model distributed systems. 


Chapters 5 - 10 define successive levels of event-state algebras modeling a transaction system 
with orphan detection. The correctness condition (view-serializability) appears at Level 0 (the highest 
level of abstraction). Level 7 (the lowest level of abstraction) is a distributed event-state algebra. At each 
new level we also construct a mapping to the previous (higher) level. 


Chapter 11 summarizes our results, and suggests possible directions for extensions to this work. 




2. Action Trees and Serializability 


This chapter gives basic definitions and lemmas for action trees and serializability. We define a 
structure called an “action tree," which is an abstraction of an execution sequence of a nested transaction 
system. Serializability (and related properties) are expressed as properties of action trees. This approach 
presents minimal constraints on the implementation of a transaction system since we make few assumptions 
about the details of concurrency control and recovery algorithms. 


2.1 Notation 


If S is a set, and o is some order which totally orders the elements of S, then <<S; o>> denotes the 
sequence consisting of the elements of S in the order given by o. 


If S is a set, then P(S) denotes the powerset of S (the set of all subsets of S). 


If S is a set, and f: S → P(S), then we associate f with the obvious relation on S ({(s,t): t ∈ f(s)}), and we 
use standard notation for relations. Thus we refer to the closure of a set under a function, we describe a 
function as acyclic, etc. f⁺ denotes the transitive closure of f, and f* denotes the reflexive-transitive 
closure of f. 


2.2 Action Summaries and Action Trees 


2.2.1 Actions and Objects 


Let obj be a universal set of data objects. For each x ∈ obj, let values(x) denote the set of 
values x can assume, including a distinguished initial value, init(x). A value assignment is a total 
mapping f: obj → values(obj), such that ∀x ∈ obj, f(x) ∈ values(x). 


Let act be a universal set of actions (i.e., transactions). Let U be a distinguished action. We 
assume that the actions are configured a priori into a tree, representing their nesting relationship, with U 
as the root. For every A ∈ act - {U}, let parent(A) denote the unique parent action for A. Then 
siblings = {(A,B) ∈ act²: parent(A) = parent(B)} 


If A ∈ act, then children(A) = {B ∈ act: parent(B) = A}. Let top = children(U). 
anc(A) = the set of ancestors of A, desc(A) = the set of descendants of A 
prop-anc(A) = anc(A) - {A}, prop-desc(A) = desc(A) - {A} 


For A ∈ act - {U}, define creator(A) as follows: 
A ∈ top ⟹ creator(A) = A 
A ∉ top ⟹ creator(A) = parent(A) 


If A,B ∈ act, then let lca(A,B) denote the least common ancestor of A and B. Let 
related = {(A,B) ∈ act²: A ∈ anc(B) ∨ B ∈ anc(A)} 
unrelated = act² - related 


(Note that (A,B) ∈ unrelated ⟹ lca(A,B) ∉ {A,B}.) 
If S is a set of actions such that ∀A,B ∈ S, (A,B) ∈ related, then we say S is an ancestor chain. 


If B ∉ anc(A), then let A|B denote the single element of anc(B) ∩ children(lca(A,B)). (Note that if A ∈ 
prop-anc(B), then lca(A,B) = A, and A|B ∈ children(A).) 


It might be convenient for the reader to think of this a priori configuration of all possible actions 
into a tree as a preassigned “naming scheme" for actions. That is, the "name" of an action is assumed to 
carry within it information which locates that action in this universal tree of actions. In any particular 
execution, only some of these possible actions will be “activated.” The (virtual) action U, the parent of all 
top-level actions, has been added for the sake of uniformity. 


Let seq ⊆ siblings be any fixed partial order, representing sequential dependency. If (A,B) ∈ 
seq, then A is constrained to run before B. For the sake of notational simplicity, we are assuming this 
relation is also fixed a priori; we assume that the "name" of any action carries within it information about 
which siblings the action can assume have completed. The use of an arbitrary partial order is a 
generalization of both the total order usually specified for the steps which occur within a single-level 
transaction, and the unconstrained order usually specified among the transactions themselves. 


We also assume a priori determination of which actions actually access data, which objects they 
access, and the functions they perform on those objects: Let accesses denote the leaves of the tree 
described above. (We assume U ∉ accesses, so that the set of actions is nontrivial.) Let object: accesses 
→ obj be a fixed function representing which object is read by a particular access. If object(A) = x, we 
say that A is an access to x, and we write A ∈ accesses(x). For A ∈ accesses, let update(A): 
values(object(A)) → values(object(A)) be a fixed function. Let sameobject denote {(A,B) ∈ accesses²: 
object(A) = object(B)}. 


We define the relation of one set of actions covering another. This concept will be useful for sets 
of aborted actions used to detect potentially “harmful" orphans. The covering relation will express the 
fact that a set has enough information to detect a harmful orphan. Let R,S ⊆ act be any sets of actions; 
we say S covers R, and we write R ≤ S if and only if for each element A in R, there is an ancestor of A 
in S. The following lemma gives elementary properties of the covering relation: 


Lemma 2.2.1.1: Let R,S,Q,T ⊆ act, A ∈ act, then 


a. R ⊆ S ⟹ R ≤ S 
b. ≤ is transitive: R ≤ S ∧ S ≤ T ⟹ R ≤ T 
c. (R ≤ S ∧ Q ≤ T) ⟹ R ∪ Q ≤ S ∪ T 
d. R ≤ S ∧ anc(A) ∩ S = ∅ ⟹ anc(A) ∩ R = ∅ 


Proof: Straightforward from the definition. ∎ 
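

As an illustration only, the covering relation can be checked mechanically. The sketch below assumes a hypothetical encoding in which an action is named by the tuple of child indices on its path from U (so U is the empty tuple); the function names are not part of the thesis.

```python
def anc(a):
    """Ancestors of a (including a itself), under the assumed tuple encoding."""
    return {a[:i] for i in range(len(a) + 1)}

def covers(s, r):
    """True iff S covers R (R ≤ S): every action in R has an ancestor in S."""
    return all(anc(a) & s for a in r)

# A small spot check of Lemma 2.2.1.1 on hypothetical action names.
R = {(0, 1), (0, 2, 3)}
S = {(0,)}
T = {()}
A = (1, 4)

part_a = covers(R, R)                                    # R ⊆ R, so R ≤ R
part_b = covers(S, R) and covers(T, S) and covers(T, R)  # transitivity
part_d = not (anc(A) & S) and not (anc(A) & R)           # no ancestor in S, none in R
```

This checks individual instances only; it does not replace the proof of the lemma.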


2.2.2 Action Summaries 


We describe an abstraction of execution sequences, using a structure called an "action 
summary.” An action summary records the status of a particular set of actions (actions can be active, 
committed or aborted). It also records the data values read by committed accesses. A slightly simpler 
structure, an “unlabeled action summary” (or UAS) records the same information except for the data 
values. An “action tree” is any action summary which is a tree: 


An action summary, S, has components vertices_S, active_S, committed_S, aborted_S, and label_S, 
where 


- vertices_S is a finite subset of act 


- active_S, committed_S, and aborted_S comprise a partition of vertices_S. (These classifications 
indicate the current status of each known action. When an action is first created, it is classified as active. 
At some later time, its classification can be changed to either committed or aborted. By “committed,” we 
mean that the action is committed relative to its parent, but not necessarily committed permanently. 
Permanent commit of an action would be represented by classification of all ancestors of the action, 
except for U, as committed.) 


- label_S: datasteps_S → values(obj) (where datasteps_S = committed_S ∩ accesses), with label_S(A) 
∈ values(object(A)). (The label of an access to an object is intended to represent the value read by that 
access. Since the access has an associated function, the value which the access writes into the object is 
deducible from the value read, and therefore need not be explicitly represented.) The read and update of 
an access are assumed to occur “instantaneously” when the access commits. (If an access aborts, it has no 
label because it never sees the object.) 


Let done_S = committed_S ∪ aborted_S. Let status_S(A) = ‘active’ (respectively, ‘committed’, 
‘aborted’) provided A ∈ active_S (respectively, committed_S, aborted_S). Let accesses_S = vertices_S ∩ 
accesses, accesses_S(x) = vertices_S ∩ accesses(x), and datasteps_S(x) = datasteps_S ∩ accesses(x). Let seq_S 
= seq ∩ (vertices_S)². Let anc-seq_S = {(A,B) ∈ (vertices_S)²: ∃B’ ∈ anc(B) ∩ vertices_S: (A,B’) ∈ seq}. Let 
children_S(A) = children(A) ∩ vertices_S. 


An unlabeled action summary has all components described above except label_S. An action 
tree, T, is an action summary where vertices_T is a tree rooted at U: If A ∈ vertices_T - {U}, then parent(A) 
∈ vertices_T. 


If T is an action summary, then unlabel(T) is the UAS obtained by omitting label_T. Definitions 
and lemmas for UAS’s carry over to action summaries in the obvious way (by applying them to 
unlabel(T)). 


2.2.3 Visible and Dead Actions 


We describe actions whose existence is intended to be known to other actions (i.e. which are not 
masked from those other actions by intervening aborts or active actions). We describe these properties 
for UAS’s; corresponding definitions and lemmas hold for (labeled) action summaries and action trees. 


Let T be a UAS. For A ∈ act, let visible_T(A) = {B ∈ vertices_T: anc(B) ∩ prop-desc(lca(A,B)) 
⊆ committed_T}. That is, visible_T(A) is just the set of actions whose existence is (potentially) known to A 
in T, because they and all their ancestors, up to and not including some ancestor of A, have committed. 
For A ∈ act, x ∈ obj, let visible_T(A,x) = visible_T(A) ∩ datasteps_T(x). Let invisible_T(A) = vertices_T - 
visible_T(A). The following lemma, which describes elementary properties of “visibility,” is proved in 
[Lynch82]: 


Lemma 2.2.3.1: Let T be a UAS, A,B,C ∈ act 


a. A ∈ desc(B) ∧ B ∈ vertices_T ⟹ B ∈ visible_T(A) 
b. A ∈ visible_T(B) ⟹ A ∈ visible_T(lca(A,B)) 
c. A ∈ visible_T(B) ∧ B ∈ visible_T(C) ⟹ A ∈ visible_T(C) 
d. A ∈ desc(B) ∧ C ∈ visible_T(B) ⟹ C ∈ visible_T(A) 
e. A ∈ desc(B) ∧ B ∈ vertices_T ∧ A ∈ visible_T(C) ⟹ B ∈ visible_T(C) 


Actions which are not visible to another action might be masked by an intervening abort, or by 
active actions only. If B is masked from A by an intervening abort, we say B is dead to A in T: if T is a 
UAS, and A ∈ act, we define dead_T(A) = {B ∈ vertices_T: anc(B) ∩ prop-desc(lca(A,B)) ∩ aborted_T ≠ 
∅}. Note that visible_T(A) ∩ dead_T(A) = ∅. If A ∈ act, x ∈ obj, then dead_T(A,x) = dead_T(A) ∩ 
datasteps_T(x). If B is not dead to A in T, we say that B is live to A in T. If A ∈ vertices_T, then we say A is 
live in T iff anc(A) ∩ aborted_T = ∅, and A is dead in T otherwise. If T is a UAS, A ∈ vertices_T, and A 
is dead in T, then we define the crucial abort of A in T, denoted crucial_T(A), as the lowest aborted 
ancestor of A in T: i.e., if S = anc(A) ∩ aborted_T then crucial_T(A) ∈ S, and ∀B ∈ S, crucial_T(A) ∈ 
desc(B). (If A is not dead in T, then crucial_T(A) is undefined. In this case we will consider that 
{crucial_T(A)} = ∅, for convenience.) 
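

The definitions of visibility, deadness, and the crucial abort can be illustrated with a small sketch. It assumes a hypothetical encoding (not from the thesis) in which an action is named by the tuple of child indices on its path from U, and a UAS is represented as a dictionary from action names to status strings. The example tree is invented for illustration.

```python
def anc(a):
    """Ancestors of a, including a itself."""
    return {a[:i] for i in range(len(a) + 1)}

def lca(a, b):
    """Least common ancestor: longest common prefix."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]

def between(a, b):
    """anc(B) ∩ prop-desc(lca(A,B)): ancestors of B strictly below lca(A,B)."""
    k = len(lca(a, b))
    return {b[:i] for i in range(k + 1, len(b) + 1)}

def visible(status, a):
    """Vertices B whose ancestors below lca(A,B) have all committed."""
    return {b for b in status
            if all(status.get(c) == 'committed' for c in between(a, b))}

def dead(status, a):
    """Vertices B masked from A by at least one intervening abort."""
    return {b for b in status
            if any(status.get(c) == 'aborted' for c in between(a, b))}

def crucial(status, a):
    """Lowest aborted ancestor of A, or None if A is live."""
    aborted = [c for c in anc(a) if status.get(c) == 'aborted']
    return max(aborted, key=len) if aborted else None

# Invented example: A = (0,) committed with a committed child; B = (1,) still
# active, with an aborted child (1, 0) and an active child (1, 1).
status = {
    (): 'active',
    (0,): 'committed', (0, 0): 'committed',
    (1,): 'active',
    (1, 0): 'aborted', (1, 0, 0): 'committed',
    (1, 1): 'active',
}
```

In this example the committed access (1, 0, 0) is dead to every action outside the aborted subtree rooted at (1, 0), while (1, 1) is invisible to U without being dead, since it is masked only by active actions.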


Let T be a UAS, A ∈ vertices_T, then we define 


v-seq_T(A) = {B: (B,A) ∈ seq ∧ B ≠ A} ∩ visible_T(A) 
i-seq_T(A) = {B: (B,A) ∈ seq ∧ B ≠ A} ∩ invisible_T(A) 
v-anc-seq_T(A) = {B: (B,A) ∈ anc-seq_T ∧ B ∉ anc(A)} ∩ visible_T(A) (see Fig. 2.1.) 
i-anc-seq_T(A) = {B: (B,A) ∈ anc-seq_T ∧ B ∉ anc(A)} ∩ invisible_T(A) (see Fig. 2.1.) 
v-child_T(A) = children(A) ∩ visible_T(A) 
i-child_T(A) = children(A) ∩ invisible_T(A) 
v-desc_T(A) = desc(A) ∩ visible_T(A) 
i-desc_T(A) = desc(A) ∩ invisible_T(A) 


2.3 Augmented Action Trees 


We define a new structure called an augmented action summary (or AAS). We can regard 
AAS’s as action summaries with an additional component: an ordering on the datasteps accessing each 
object. Formally we define an AAS as a pair T = <S,O>, where S is an action summary, and O: obj → 
P(sameobject), where for all x ∈ obj, O(x) is a total order on datasteps_S(x). (Thus O(x) ⊆ datasteps_S(x)².) 
If T = <S,O> is an AAS, then we define erase(T) = S, order(T) = O. We extend our notation for 
components and functions of action summaries to components and functions of AAS’s in the obvious 
way, by applying them to erase(T). (For example if T = <S,O>, then we will use "vertices_T" to refer to 
vertices_S.) Definitions for action summaries and UAS’s carry over to AAS’s in the obvious way (by 
applying them to erase(T) or to unlabel(erase(T))). If T = <S,O>, then we define data_T = ∪ {O(x): x ∈ obj}. 
An augmented action tree (AAT) is an AAS where erase(T) is an action tree. 


Fig. 2.1. Visible and Invisible Ancestor-Sequence 
[Figure not reproduced. Two panels: B ∈ v-anc-seq_T(A), and B ∈ i-anc-seq_T(A) (annotated "or B could be active").] 


Let T be an AAS, A ∈ vertices_T, then we define 
v-data_T(A) = {B: (B,A) ∈ data_T ∧ B ≠ A} ∩ visible_T(A) 
i-data_T(A) = {B: (B,A) ∈ data_T ∧ B ≠ A} ∩ invisible_T(A) 
v-data-anc_T(A) = {A|B: B ∈ v-data_T(A)} 
i-data-anc_T(A) = {A|B: B ∈ i-data_T(A)} 
v-precedes_T(A) = v-anc-seq_T(A) ∪ v-child_T(A) ∪ v-data-anc_T(A) 
i-precedes_T(A) = i-anc-seq_T(A) ∪ i-child_T(A) ∪ i-data-anc_T(A) 


The “visible precedence" relation, v-precedes_T, will be used in Chapter 3 to define a "view tree” 
which represents an action’s view of an execution history. We state here some elementary properties of 
this relation. 


Lemma 2.3.1: Let T be an AAS, A ∈ vertices_T. Then 
B ∈ v-precedes_T(A) ⟹ parent(B) = lca(A,B). 


Proof: 


1. B ∈ v-anc-seq_T(A) ⟹ (B,A’) ∈ seq for some A’ ∈ anc(A), and B ≠ A’. Thus 
parent(B) = parent(A’) = lca(A,B). 
2. B ∈ v-child_T(A) ⟹ parent(B) = A = lca(A,B). 
3. B ∈ v-data-anc_T(A) ⟹ B = A|b for some b ∈ accesses. Thus B ∈ 
children(lca(A,B)), ⟹ parent(B) = lca(A,B). ∎ 


Lemma 2.3.2: Let T be an AAS, A ∈ vertices_T. Then B ∈ v-precedes_T⁺(A) ⟹ 
B ∈ visible_T(A), and B ∈ committed_T. 


Proof: B ∈ visible_T(A) is obvious from transitivity of visible_T (Lemma 2.2.3.1c). To see that 
B ∈ committed_T, note that if B ∈ visible_T(C) for some C, then B ∈ committed_T or B ∈ 
anc(C). 
But B ∈ v-precedes_T⁺(A) ⟹ B ∈ v-precedes_T(C) for some C, ⟹ B ∈ visible_T(C). But B ∉ 
anc(C), by Lemma 2.3.1, so B ∈ committed_T. ∎ 


If T is an AAS, A ∈ vertices_T, then we define the view set of A in T as the v-precedes_T-closure of 
A: vset_T(A) = v-precedes_T*(A). The following lemma gives elementary closure properties of view sets. 


Lemma 2.3.3: Let T be an AAS, A ∈ vertices_T, B ∈ vset_T(A). Then 


a. vset_T(B) ⊆ vset_T(A) 
b. v-desc_T(B) ⊆ vset_T(A) 
c. v-data_T(B) ⊆ vset_T(A) 


Proof: (a) is obvious from the definition. v-desc_T closure (b) follows inductively from 
v-child_T ⊆ v-precedes_T. We show (c): 


Suppose C ∈ v-data_T(B). Then B|C ∈ v-data-anc_T(B) 
⟹ B|C ∈ vset_T(A), since vset_T(A) is v-precedes_T-closed. 


C ∈ visible_T(B) ⟹ C ∈ v-desc_T(B|C). 


But vset_T(A) is v-desc_T-closed by (b), ⟹ C ∈ vset_T(A). ∎ 


The following lemma gives an ancestor-closure property for view sets (the view set itself is not 
ancestor closed, but the view set of an action together with the proper ancestors of that action forms an 
ancestor-closed set). 


Lemma 2.3.4: Let T be an AAS, A ∈ vertices_T. If W = vset_T(A) ∪ prop-anc(A), then W is 
anc-closed. 


Proof: Let V = vset_T(A). We show inductively that B ∈ V ⟹ anc(B) ⊆ W. Since B ∈ 
prop-anc(A) ⟹ anc(B) ⊆ prop-anc(A) ⊆ W, anc-closure follows. 
Basis: A ∈ W, anc(A) ⊆ {A} ∪ prop-anc(A). 


Induction: Assume B ∈ V, anc(B) ⊆ W, and take C ∈ v-precedes_T(B). By Lemma 2.3.1, 
parent(C) = lca(B,C) ⟹ prop-anc(C) ⊆ anc(B) ⊆ W. But C ∈ V ⟹ {C} ⊆ W ⟹ anc(C) 
⊆ W. ∎ 


2.4 Serializability 


We define serializability for action trees. Let T be an action tree. A partial order p ⊆ siblings 
is linearizing for T provided p totally orders all siblings in T. A linearizing partial order p induces a total 
order, induced_{T,p}, on accesses_T in the obvious way: (A,B) ∈ induced_{T,p} ⟺ (B|A, A|B) ∈ p. If A ∈ 
accesses(x) and p is a linearizing partial order for T, let preds_{T,p}(A) denote the sequence <<{B ∈ 
visible_T(A,x): (B,A) ∈ induced_{T,p} ∧ B ≠ A}; induced_{T,p}>>. 


If x ∈ obj and s is some finite sequence of accesses, then we define result(x,s) as follows: If s is 
the empty sequence, then result(x,s) = init(x). Otherwise let s = s’A, where A ∈ accesses. Then 
result(x,s) = update(A)(result(x,s’)) if A is an access to x, = result(x,s’) otherwise. 
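

The recursive definition of result(x,s) is a left fold over the sequence s. The following sketch shows this under assumed representations: `init`, `obj`, and `update` are hypothetical dictionaries standing in for init(x), object(A), and update(A), and the access names are invented.

```python
from functools import reduce

def result(x, s, init, obj, update):
    """Value of object x after applying, in order, the accesses in s.

    Accesses to other objects leave the value of x unchanged; the empty
    sequence yields init(x).
    """
    return reduce(
        lambda value, a: update[a](value) if obj[a] == x else value,
        s,
        init[x],
    )

# Hypothetical data: three accesses that each increment their object.
init = {'x': 0, 'y': 0}
obj = {'A1': 'x', 'A2': 'y', 'B2': 'x'}
update = {a: (lambda v: v + 1) for a in obj}

x_seen_by_B2 = result('x', ['A1', 'A2'], init, obj, update)   # B2 runs after A1, A2
```

Here `x_seen_by_B2` is the value an access to x would be expected to read if the accesses A1 and A2 preceded it in a serial order.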


A linearizing partial order p for T is said to be a serializing partial order for T provided p is 
consistent with seq_T, and label_T(A) = result(x,preds_{T,p}(A)), for all A ∈ datasteps_T(x). This definition says 
that the value seen by each datastep is equivalent to the result of a serial execution in the order given by 
p, where only committed actions have any effect. T is said to be serializable provided there exists some 
serializing partial order for T. 


2.5 Serializability of Augmented Action Trees 


An AAT, T, is serializable iff erase(T) is a serializable action tree. It is convenient to define a 
stronger condition than serializability for AATs, which we call "data-serializability." An AAT, T, is 
data-serializable iff there exists p, a serializing partial order for erase(T), with the additional property that 
induced_{T,p} is consistent with data_T. Obviously if T is data-serializable, then it is serializable. 


Data-serializability has a cycle-free characterization similar to those in usual concurrency control 
theory. First, we give a definition which says that the label of each access describes the correct object 
value which the access should see, if the versions of objects are ordered according to the data_T order. 
Formally, an AAT is version-compatible iff for every object x ∈ obj, and every A ∈ datasteps_T(x), it is the 
case that label_T(A) = result(x,s), where s = <<v-data_T(A); data_T>>. The following theorem is proved in 
[Lynch82]: 


Theorem 2.5.1: An AAT, T, is data-serializable if and only if both of the following are true: 


a. T is version-compatible. 


b. There are no cycles of length greater than one in seq_T ∪ sibling-data_T. 


2.6 Restrictions of Trees 


It is often useful to project an action tree (or an AAT) onto a particular set of vertices. We call 
the resulting action summary a restriction of the original tree. 


Defn 2.6.1: Let T be an action tree (or an AAT), V ⊆ vertices_T. We define the restriction of 
T to V, denoted T|V, as follows: (let S = T|V) 


vertices_S = V 
∀v ∈ V, status_S(v) = status_T(v) 
∀A ∈ datasteps_S, label_S(A) = label_T(A) 
If T is an AAT, then data_S = V² ∩ data_T 


We say S is a restriction of T iff S = T|vertices_S. We say S is a subtree of T iff S is a 
restriction of T which is also a tree rooted at U (i.e. vertices_S is anc-closed). 


Stating the simplest correctness requirements for executions only requires consideration of 
actions whose effects become “permanent.” For an action tree (or AAT), T, we define a restriction of T to 
all actions which have committed through the top level: perm(T) = T|visible_T(U). It is easy to verify 
that perm(T) is a subtree of T. 


The following lemma shows that if an action has no descendants in datasteps_T, then it cannot 
affect serializability of T: 


Lemma 2.6.2: Let T be an action tree, A ∈ vertices_T - {U}. If desc(A) ∩ datasteps_T = ∅, 
then T is serializable if and only if T|(vertices_T - desc(A)) is serializable. 


Proof: Let T’ = T|(vertices_T - desc(A)). 


First we show T serializable ⟹ T’ serializable. Let p be a serializing partial order for T, and 
let p’ be p restricted to vertices_T’. Then p’ is obviously a linearizing partial order for T’. Let 
B ∈ datasteps_T’. 


label_T’(B) = label_T(B) = result(x,preds_{T,p}(B)), since p is serializing for T. But desc(A) ∩ 
datasteps_T = ∅, ⟹ preds_{T,p}(B) = preds_{T’,p’}(B). Thus p’ is a serializing order for T’. 


Now assume T’ is serializable, and let p’ be a serializing partial order for T’. Let p be any 
linearizing order for T that is consistent with p’. Let B ∈ datasteps_T. Then B ∈ datasteps_T’. 


label_T(B) = label_T’(B) = result(x,preds_{T’,p’}(B)), since p’ is serializing for T’. But desc(A) ∩ 
datasteps_T = ∅, ⟹ preds_{T’,p’}(B) = preds_{T,p}(B), since p is consistent with p’. Thus p is a 
serializing order for T. ∎ 


We will frequently use trees that are restrictions of the global action tree with the exception that 
the proper ancestors of one action are considered active (instead of whatever status they have in the global 
action tree). We term this process “backing up” an action tree since we are effectively undoing whatever 
commits or aborts of the proper ancestors might have occurred. This construction will be useful for 
defining trees representing the “view” of an action, since the action will believe its proper ancestors to be 
active (whether or not they have already committed or aborted). 


Defn 2.6.3: Let T be an action tree (or an AAT), A ∈ vertices_T. We define the tree T backed 
up through A, denoted T//A, as follows: (let S = T//A) 


vertices_S = vertices_T 
B ∈ prop-anc(A) ⟹ status_S(B) = ‘active’ 
B ∈ vertices_T - prop-anc(A) ⟹ status_S(B) = status_T(B) 
∀A ∈ datasteps_S, label_S(A) = label_T(A) 
If T is an AAT, then data_S = data_T 


Finally, for functions from actions to sets of actions we will occasionally want to exclude some 
actions from the domain of a function. The set of actions excluded will always be the proper ancestors of 
a particular action, so we define exclusion with respect to this action: 


Defn 2.6.4: Let f: act → P(act). We define the exclusion of f from A, denoted f//A, as the 
function: 


(f//A)(s) = f(s), if s ∉ prop-anc(A), 
= ∅, if s ∈ prop-anc(A) 
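

Both constructions are simple to state operationally. The sketch below is illustrative only and assumes a hypothetical encoding: an action is named by the tuple of child indices on its path from U, the status component of a tree is a dictionary, and a function from actions to sets of actions is an ordinary Python callable. Labels and the data order are left out because backing up does not change them.

```python
def prop_anc(a):
    """Proper ancestors of a under the assumed tuple encoding."""
    return {a[:i] for i in range(len(a))}

def back_up(status, a):
    """Status component of T//A: proper ancestors of A are treated as active,
    every other vertex keeps its status from T."""
    pa = prop_anc(a)
    return {b: ('active' if b in pa else st) for b, st in status.items()}

def exclude(f, a):
    """f//A: agrees with f, except that proper ancestors of A map to the empty set."""
    pa = prop_anc(a)
    return lambda s: set() if s in pa else f(s)

# Invented example: (0,) has aborted, leaving its committed child (0, 1) an orphan.
status = {(): 'active', (0,): 'aborted', (0, 1): 'committed', (1,): 'committed'}
backed = back_up(status, (0, 1))

def children(s):
    """Children of s among the vertices of the example tree."""
    return {b for b in status if b != () and b[:-1] == s}

children_excl = exclude(children, (0, 1))
```

In the backed-up tree the orphan (0, 1) sees its aborted parent (0,) as still active, which matches the intent that an action believes its proper ancestors to be active.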


3. View-Serializability 


This chapter presents a correctness condition for action systems, which we call view-serializability. 
The definitions relating to view-serializability are developed using action trees: no specific execution model 
for generating these trees is yet assumed. View-serializability is intended to model “internal consistency:" a 
system which generates only view-serializable action trees will not allow actions to see inconsistent states, 
even if these actions are orphans. 


3.1 External Consistency and Internal Consistency 


A fundamental property of atomic actions is that the effects of their concurrent execution 
should be “equivalent to” an execution where each action is run in isolation, and (if the action commits) 
to completion. Different notions of “equivalence” give rise to different conditions modeling atomicity. 
External consistency of a transaction system requires that for any execution the view of an observer 
outside the system is identical to the view that would result from some serialization of this execution. 
There might be interaction between an action and a user which is outside the scope of the "system" (e.g. 
output to a terminal, which cannot be undone when an action aborts). Since a transaction system can 
only make guarantees about the states of objects under system control, we will ignore the effects of 
“extra-system” communication on serializability. (Insuring consistency in such an environment is the 
responsibility of user programs. At this level, "consistency" is an application-specific concept: for some 
applications terminal output from actions which are later aborted might be acceptable, for example.) 
Given this restriction, only actions which commit through the top level can affect the system state as seen 
by an outside observer. 


Internal consistency requires that the effects of concurrency are masked from any action in the 
system. If a system provides external consistency, then all actions which commit through the top level 
must see system states consistent with some serial schedule. Other actions might see inconsistent states, 
however. In particular, the views of orphans are not considered for external consistency, since orphans 
cannot commit through the top level. 


We model external consistency by requiring that perm(T), the subtree of the action tree 
consisting of all actions which commit through the top level, be serializable. In [Lynch82], a model for a 
distributed transaction system based on the locking protocol developed in [Moss81] is shown to be 
externally consistent: Lynch shows that for all action trees, T, generated by the model, perm(T) is 
serializable. 


To see that serializability of perm(T) is not sufficient to guarantee internal consistency, consider 
the example from Fig. 1.1. The consistency constraint x = y is violated for action A2, but perm(T) (which 
consists of U, B, B1, and B2) is serializable. 


Although serializability of perm(T) is not sufficient for internal consistency, serializability of the 
entire action tree is not necessary for internal consistency. We can easily construct action trees for 
executions which we believe are internally consistent (since no action can see an inconsistent state), but 
which are not serializable. Consider the example shown in Fig. 3.1. Again, the integrity constraint on 
the system state is x = y. Initial values of x and y are 0. Action B1 runs first, views x = 0, and then 
aborts. Then actions A1, A2, B2, and B3 run (in that order). A1 and B2 increment x, and A2 and B3 
increment y. The tree is not serializable, because A must be serialized before B (since B2 views x = 1), 
yet B1 did not view the effect of A1. The tree is internally consistent, however, because no particular 
action was able to observe x ≠ y. (B1 viewed x = 0, but it had no information about the value of y. Since 
B1 aborted, it did not pass its view of x to the rest of B.) 


Thus serializability of the entire action tree is too strong a condition for internal consistency. 
We need a weaker condition which takes into account the views of aborted actions and orphans as well as 
the views of actions that commit through the top level. 


In the following sections we will define the possible "views" of each action in an action tree, and 
we will state a condition modeling internal consistency which is based on serializability of these views. 


Fig. 3.1. Non-serializable, Internally Consistent Action Tree 
[Figure not reproduced. Top-level actions A and B are both committed; A has children A1 and A2, and B 
has children B1 (aborted), B2, and B3. Values read, left to right: x,0  y,0  x,0  x,1  y,1.] 


Using this definition, the only view for action U will be perm(T); thus our formal "view" for action U 
corresponds to our intuitive notion of the view of an “outside observer." Our condition for external 
consistency will then be a special case of our condition for internal consistency. 


3.2 Information Flow and Information Trees 
3.2.1 Information: Object Values and Execution Histories 


Internal consistency requires that any action’s view of the system state must be consistent with 
an "illusion" of serial execution. To formalize internal consistency, we must attempt to be precise about 
what constitutes an action’s "view" in a particular system state. A simple approach would try to capture 
the knowledge that an action has of the current values of objects. Thus for the example in Fig. 1.1, we 
might say that action A2 knows that x=0, and if A2 is allowed to read y then it will know that x=0 and 
that y=1 (an “inconsistent” view). 


A definition which describes the view of an action as a (partial) binding of objects to known 
values is not sufficient to handle more complex examples, however. Suppose that action A creates 
concurrent children A1 and A2 to read and update object x. x is a boolean object, assuming only logical 
values (0 and 1). Both A1 and A2 read x, and perform a logical not operation on x. A1 returns the value 
0, and A2 returns the value 1. If A cannot determine which child ran first, then it is unsure of the 
“current” value of x. 
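

The ambiguity in this example can be checked by brute force. The sketch below is illustrative and rests on stated assumptions: each child returns the value it read, the two children run serially in some order, and A has no other knowledge of x (in particular, it does not know the value of x before its children ran). It enumerates every serial order and starting value that explains what A1 and A2 returned.

```python
from itertools import permutations

def run(order, x0):
    """Run the children serially; each reads x, returns it, and negates x."""
    x, returned = x0, {}
    for child in order:
        returned[child] = x   # the child reports the value it read
        x = 1 - x             # logical not
    return returned, x

observed = {'A1': 0, 'A2': 1}

# Every (order, starting value, final value) consistent with the returned values.
explanations = [
    (order, x0, final)
    for order in permutations(['A1', 'A2'])
    for x0 in (0, 1)
    for returned, final in [run(order, x0)]
    if returned == observed
]

possible_current_values = {final for _, _, final in explanations}
```

Two explanations survive, one for each order of the children, and they disagree on the final value of x. The returned values alone therefore do not tell A the current value.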


This uncertainty about "current" values can affect our notion of “consistency.” Suppose, for 
example, that action C creates child C1 to read object x, and C1 returns the value x=1. C then creates 
concurrent children C2 and C3, passing them the “information” that x=1. But C2 and C3 both read and 
increment x. Depending on which action runs first, the later one will see an "inconsistency" between 
what its parent told it (x=1) and the current state (x=2). But if both C2 and C3 realize that the other 
might have run first, then both can explain this potential "inconsistency." 


These examples illustrate that direct information about the “current” value of an object is only 
available to accesses which directly read that object. All other information is “hearsay,” in a sense, 
because it expresses only what another action saw or was told. We thus regard the “information” 
available to an action as its knowledge of the execution history of the system: an action might know with 
certainty that action B read y=5, but it cannot automatically assume that the value of y is 5. By treating 
information as information about execution histories, we can explain the seeming ambiguities and 
conflicts described in the examples above. For the first example, A’s information is that A1 read x=0 
and A2 read x=1. In the second example, the information available to C2 and C3 is that “C1 ran and saw 
x=1." Neither C2 nor C3 can conclude that the current value of x is 1. If C2 were run sequentially before 
C3 (and C had no other children), then C2 could conclude x=1. This conclusion of C2 depends on a 
serializability assumption, which is a basic part of consistency, and on C2’s knowledge of the structure of 
other actions (in this case knowledge that no siblings can intervene between C1 and C2). We elaborate on 
these points in the following sections. 


3.2.2 Paths of Information Flow 


In designing system algorithms to guarantee consistency, we often take a “worst case” approach 
regarding information flow among actions. To define an action’s view in this sense, we must consider all 
possible sources of information about the execution history to an action. We say that information flows 
from action A to action B if B learns something about the execution history from A. The actual value(s) 
passed from A to B will generally be some function of the values of objects seen by A; we lose no 
generality by assuming that A passes B its complete knowledge of the execution history. Again, this 
assumption amounts to a worst-case approach for information flow: If action A reads object x, and A 
passes some information to B, B does not necessarily have specific “information” about the value of x 
seen by A. The actual values passed from A to B might be constants, for example, giving B no 
information at all about the execution history. But since B might have any information that A might have 
had, we will assume that it does. 


Let A be an action whose view is being defined. We imagine that actions are encapsulated in 
procedure-like structures, with well-defined inputs and outputs. Thus we assume that information can 
flow to A only in the following three ways: 

1. If A is an access to x (and A commits), then A reads the value of x. 
2. parent(A) passes information to A when A is created. 
3. Committed children of A pass information to A when they return (i.e. when they commit to A). 


Path (3) is limited to committed children, reflecting an assumption that aborted children do not 
pass “information” to their parents. If aborted children are allowed to return values to their parents (as in 
Argus), then this assumption can be violated. In Argus, return values from aborted children are a 
recognized “loophole” in the system. We retain our assumption because it models the fundamental 
semantics of “abort” which are derived from atomicity: An atomic action runs completely or not at all. If 
an atomic action aborts, all effects should be as if it had never run at all, and an action which never runs 
cannot return values. 


A more subtle assumption is that the very fact that a child has aborted cannot give the parent 
any “information” about the execution history, other than the fact that the child aborted. A child which 
reads object x might be programmed to commit if it sees x = 1, for example, and to abort otherwise. If the 
child aborts, one might think that the parent could then assume that the child read x and found x ≠ 1. 
However, we make a basic assumption that an action can be aborted at any time by the system, and that 
the parent cannot necessarily distinguish between a system-initiated abort and an abort caused by the 
child itself. For example, the system might abort a child because of a communications failure, even if the 
child were going to commit. (In a practical system, such as Argus, it might be useful to identify the cause 
for a system-initiated abort, so the parent will know how to proceed. These explanations for aborts fall 
into the same “loophole” category as return values from aborted children.) Given the assumptions that 
aborted children cannot return values, and that aborts are always possible, whatever the system state, 
aborts serve as impenetrable barriers to information flow. 


3.2.3 Circularity of Information Flow 


We would like to describe the information available to an action in an action tree by listing all 
the actions which are (potential) sources of information to that action. Our formulation of the three paths 
of information flow is not convenient for this purpose, because it contains a confusing circularity: 
information flows from a parent to its children, and also from a (committed) child back to its parent. By 
naively following the paths of information flow we would conclude that an action is a source of 
information to itself, which makes no sense. Of course this circularity is fictitious, because the flow of 
information from parent to child happens at a different time than the flow of information from child to 
parent. 


One approach based directly on the three paths of information flow above would be to define 
the information available to an action as a function of time. By including time as a parameter of available 
information, the circularity described above can be removed (i.e. “available information” will no longer 
be recursively defined). We would like to describe the information available to an action without 
referring to time, however. Although the information available to an action does change as an execution 
proceeds, we would like to capture the maximum amount of information that an action sees during an 
execution. Since an action’s information can only increase over time, an action attains its maximum 
information at its latest active point in an execution (if it completes, this point is immediately before it 
commits or aborts). 


We achieve a “time-independent” definition of available information below by reformulating 
the paths of information flow. The alternate formulation contains no circularity. 


3.2.4 Information Flow from Siblings of Ancestors 


We remove the circularity in the paths of information flow by “short-circuiting” flow through 
ancestors: 


Information can flow between sibling actions via the parent only if one sibling commits before 
another is created: upon commit, the first sibling passes information to its parent, and the parent passes 
this information to the next sibling when creating it. (There can also be indirect information flow via 
objects.) In some systems, this path of information flow might allow an action to see information known 
by any sibling which had committed before the action was created. We assume that flow of information 
between siblings (via the parent) is restricted to flow from sequentially preceding committed siblings. We 
are making an assumption here that the control structure of actions does not permit direct flow of 
information between concurrent siblings. (This assumption holds in Argus, because all concurrent 
siblings must be created “at once” by a coenter statement. It is impossible for a concurrent sibling to 
commit before another is created; thus it is impossible for information to flow directly between them. 
Concurrent siblings cannot communicate except by modifying shared objects.) 


Thus we can list the sources of information to A’s parent which can serve as sources of 
information to A (when A is created): (1) Sequentially preceding committed siblings of A, (2) Any action 
which was a source of information to A’s parent when the parent was created. In “unwinding” this 
recursion, we can define the sources of information to A when A is created as all actions which are 
committed and sequentially precede some ancestor of A. We thus obtain an equivalent definition of the 
sources of information to an action by replacing (2) above with a path of information flow from these 
sequentially preceding committed siblings of ancestors: 


2. Information passed from B to A, where B has committed, and B sequentially precedes some 
ancestor of A (B ∈ v-anc-seq_T(A)). 


Using this second formulation we can give a single definition of the (maximum) information 
available to an action in any particular execution history (i.e. for any action tree). With the new 
specification of information source (2), the only paths of information flow are from committed actions 
(and from objects). We assume that committed actions release their complete (maximum) information to 
other actions when they commit. 
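
The unwinding of the recursion can be checked mechanically on a small example. The following sketch is illustrative only: the tree, the tables `PARENT`, `STATUS` and `SEQ`, and both function names are assumptions made for this example and are not definitions from the text. It compares the recursive description (sources of the parent at its creation, plus sequentially preceding committed siblings) with the unwound one (committed actions that sequentially precede some ancestor).

```python
# Hypothetical action tree; U is the root. Names and statuses are illustrative.
PARENT = {"P0": "U", "P": "U", "Q": "U", "B1": "P", "B2": "P", "A": "P"}
STATUS = {"U": "active", "P0": "committed", "P": "active", "Q": "committed",
          "B1": "committed", "B2": "aborted", "A": "active"}
# (X, Y) in SEQ means sibling X sequentially precedes sibling Y (assumed transitive).
SEQ = {("P0", "P"), ("B1", "B2"), ("B2", "A"), ("B1", "A")}


def preceding_committed_siblings(a):
    """Committed siblings that sequentially precede a."""
    return {x for (x, y) in SEQ if y == a and STATUS[x] == "committed"}


def sources_recursive(a):
    """Sources of information at creation, following the recursive description."""
    if a not in PARENT:              # the root has no parent and no siblings
        return set()
    return preceding_committed_siblings(a) | sources_recursive(PARENT[a])


def sources_unwound(a):
    """Committed actions that sequentially precede some ancestor of a."""
    ancestors, node = set(), a
    while node is not None:          # an action counts as its own ancestor here
        ancestors.add(node)
        node = PARENT.get(node)
    return {x for (x, y) in SEQ if y in ancestors and STATUS[x] == "committed"}


print(sorted(sources_unwound("A")))  # ['B1', 'P0']
```

In this tree the aborted sibling B2 and the concurrent top-level action Q are not sources for A under either description.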


3.2.5 Information Trees 


Since we are using action trees as an abstraction of execution histories (and hence of system 
states), we describe an action’s view of the history as a particular (backed up) subtree of the (global) 
action tree. We call this tree the information tree for an action. We can think of the information tree for 
action A as being defined recursively: it is constructed by merging all the information trees of actions 
from which information can flow to action A. 


Because an action might be aware that some actions have aborted, these aborts should strictly be 
included in the information tree. (If action A sequentially precedes B, for example, then B will know that 
A has either committed or aborted.) Although aborts are part of the execution history, we have argued 
above that they convey no additional information. (In other words, the existence of an abort tells an 
action nothing other than that the abort occurred.) For simplicity, then, we exclude these aborted actions 
from the information tree. 


The vertices of the information tree for an action are simply all vertices reachable by “tracing 
back” the three paths of information flow listed above. Since the information tree is a subtree of the 
global action tree, path (1) is accounted for by the labels of datasteps. (In other words, if a datastep is 
labeled with “u” in the global action tree, it will be labeled with “u” in the information tree. This value 
read is part of the execution history of the datastep, and should thus be included with the datastep.) Path 
(2) requires that if B is in the information tree, and C ∈ v-anc-seq_T(B), then C is in the information tree. 
Path (3) requires that if B is in the information tree, and C is a committed child of B, then C is in the 
information tree. 


Defn 3.2.5.1: Let T be an action tree, A ∈ vertices_T. We define the information set of A in T, 

info-set_T(A) = (v-anc-seq_T ∪ v-child_T)*(A) 

and the information tree of A in T, 

info-tree_T(A) = (T|W)//A, where W = info-set_T(A) ∪ prop-anc(A) 


We include proper ancestors of A in the information tree, but since information has only flowed 
through these ancestors from sequentially preceding committed siblings, we do not include them in the 
information set. The proper ancestors are considered active since A will regard them as active. (Thus the 
information tree is “backed up” through A.) It is possible that some of these ancestors might have 
committed or aborted, but these changes in status should not be visible to A. 


The following lemma gives an equivalent definition of the information set which is easier to use 
because it does not involve closures of functions. 

Lemma 3.2.5.2: Let T be an action tree, A ∈ vertices_T. Then 
info-set_T(A) = v-desc_T(v-anc-seq_T(A) ∪ {A}). 


Proof: Let V = info-set_T(A) = (v-anc-seq_T ∪ v-child_T)*(A), and let W = 
v-desc_T(v-anc-seq_T(A) ∪ {A}). It is obvious that W ⊆ V. We show V ⊆ W by induction on 
V: 


Basis: A ∈ V, but A ∈ W because A ∈ v-desc_T({A}). 


Induction: Let B ∈ V, and assume B ∈ W. Take C ∈ v-child_T(B) ∪ v-anc-seq_T(B). We show 
C ∈ W. 


Since B ∈ W, B ∈ v-desc_T(B’), for some B’ ∈ v-anc-seq_T(A) ∪ {A}. If C ∈ v-child_T(B), then 
C ∈ v-desc_T(B’). If C ∈ v-anc-seq_T(B), then either C ∈ prop-desc_T(B’), or (C,B’) ∈ siblings_T, 
or C ∈ v-anc-seq_T(parent(B’)). 


If C ∈ prop-desc_T(B’), then C ∈ v-desc_T(B’). 
If (C,B’) ∈ siblings_T, then (C,B’) ∈ seq_T ⇒ C ∈ v-anc-seq_T(A), by transitivity of seq. 
If C ∈ v-anc-seq_T(parent(B’)), then C ∈ v-anc-seq_T(A). ∎ 


We now use this equivalent definition of the information set to prove three simple lemmas 
about information sets and information trees: 


Lemma 3.2.5.3: Let T be an action tree, A ∈ vertices_T, W = info-set_T(A) ∪ prop-anc(A). 
Then W is ancestor-closed. (Thus the information tree is in fact a tree.) 


Proof: Let B ∈ W, C ∈ prop-anc(B). We must show C ∈ W. Let V = info-set_T(A). If B ∈ 
prop-anc(A), then C ∈ prop-anc(A) ⇒ C ∈ W. 

If B ∈ V then B ∈ v-desc_T(v-anc-seq_T(A) ∪ {A}), by Lemma 3.2.5.2. If B ∈ v-desc_T({A}), 
then C ∈ v-desc_T({A}) ∪ prop-anc(A) ⇒ C ∈ W. 

If B ∈ v-desc_T(v-anc-seq_T(A)), then either C ∈ v-desc_T(v-anc-seq_T(A)), or C ∈ anc(A), ⇒ C 
∈ W. ∎ 


Lemma 3.2.5.4: Let T be an action tree, A ∈ vertices_T. Then prop-anc(A) ∩ info-set_T(A) = ∅. 


Proof: Follows directly from Lemma 3.2.5.2. ∎ 


Lemma 3.2.5.5: Let T be an action tree, A ∈ vertices_T, and let S = info-tree_T(A). Then 
vertices_S ⊆ visible_T(A). 


Proof: Follows directly from Lemma 3.2.5.2. ∎ 
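
The equivalence in Lemma 3.2.5.2 and the disjointness in Lemma 3.2.5.4 can be exercised on a concrete tree. The sketch below is a minimal illustration under stated assumptions: v-child is read as "committed children", v-anc-seq as "committed actions that sequentially precede some ancestor (the action itself included)", and v-desc as the closure under v-child. These readings, the example tree, and all Python names are assumptions of the sketch; the formal definitions are given in the earlier chapter.

```python
# Hypothetical action tree used only for illustration; U is the root.
PARENT = {"P0": "U", "P0x": "P0", "P": "U", "Q": "U", "Qx": "Q",
          "B1": "P", "B1a": "B1", "B1b": "B1", "B2": "P", "A": "P", "A1": "A"}
STATUS = {"U": "active", "P0": "committed", "P0x": "committed", "P": "active",
          "Q": "committed", "Qx": "committed", "B1": "committed",
          "B1a": "committed", "B1b": "aborted", "B2": "aborted",
          "A": "active", "A1": "committed"}
# (X, Y) in SEQ means sibling X sequentially precedes sibling Y (assumed transitive).
SEQ = {("P0", "P"), ("B1", "B2"), ("B2", "A"), ("B1", "A"), ("B1a", "B1b")}


def ancestors(b):
    """Ancestors of b, including b itself."""
    out = set()
    while b is not None:
        out.add(b)
        b = PARENT.get(b)
    return out


def v_child(b):
    """Assumed reading of v-child: the committed children of b."""
    return {c for c, p in PARENT.items() if p == b and STATUS[c] == "committed"}


def v_anc_seq(b):
    """Assumed reading of v-anc-seq: committed actions preceding an ancestor of b."""
    return {x for (x, y) in SEQ if y in ancestors(b) and STATUS[x] == "committed"}


def closure(step, start):
    """Reflexive-transitive closure of `step`, applied to the set `start`."""
    seen, frontier = set(start), list(start)
    while frontier:
        for y in step(frontier.pop()):
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen


def info_set_by_closure(a):
    """Information set computed directly as a closure (as in Defn 3.2.5.1)."""
    return closure(lambda b: v_anc_seq(b) | v_child(b), {a})


def info_set_by_lemma(a):
    """Information set computed without iterating v-anc-seq (as in Lemma 3.2.5.2)."""
    return closure(v_child, v_anc_seq(a) | {a})


print(sorted(info_set_by_closure("A")))  # ['A', 'A1', 'B1', 'B1a', 'P0', 'P0x']
print(info_set_by_closure("A") == info_set_by_lemma("A"))  # True
```

Neither the proper ancestors P and U nor the concurrent action Q appear in the result, which matches the informal description of information flow.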


3.3 Behavioral Constraints and View Trees 


The information tree represents all information about the execution history which might be 
available to a particular action as a result of information flow in this execution (except for information 
about aborts). For this information to be “consistent,” it must not contradict the assumptions an action 
might have about the system’s behavior. One of these assumptions is the illusion of serial execution: no 
action should see the effects of concurrency. Failure atomicity also requires that no action should see the 
effects of aborted actions. An action might have additional expectations about the system’s behavior, 
however. Often these expectations are captured in invariants on the system state which all actions 
preserve (when run in isolation and to completion). An action might function correctly only if a 
particular invariant holds. (Its effects when the invariant does not hold might be unexpected or 
unspecified.) 


To develop a notion of “consistency,” we imagine that an observer is placed at an action and is 
given that action’s information tree. The observer is also informed of any invariants on the system state 
that are preserved by all actions in isolation, and he is told that the system executes actions in some serial 
order. (Of course, the actual order might not be serial, but the observer should be unaware of this 
interleaving.) There are two types of inconsistencies which he might find: (1) The observer sees the 
effects of concurrency. For example, action A spawns child A1 to read x (no update), and finds x = 1. 
Then A spawns child A2 (sequentially following A1) to read x, and A2 returns x = 2. (A has no other 
children.) This situation is clearly inconsistent with serializability. (2) The observer might deduce that 
the system state violates an invariant. For example, an observer at action A2 in Fig. 1.1 would see x ≠ y. 


The first type of inconsistency can be prevented by requiring that the information tree be 
serializable. Serializability of the information tree is too strong a condition, however, because the effects 
of other actions might be visible (through data objects) even though these actions are not in the 
information tree. 


Since we want to formulate a consistency condition which does not depend on particular 
invariants for particular applications, we will increase the amount of information we presume is available 
to an observer. In other words, we will provide a sufficient consistency condition, which might not be 
necessary to insure consistency in all cases. We now assume that an observer at an action has complete 
knowledge of the set of possible behaviors of all other actions in the system (when run in isolation and to 
completion). We might imagine that the observer is given program listings for all actions, for example. 
This knowledge is sufficient to determine any invariants. (In a sense invariants are just one way of 
specifying certain aspects of program behavior.) Other than the actions in his information tree, he does 
not know what particular actions have actually run in the current execution, but if he is told that a 
particular action did run he can deduce the possible effects that it had (by checking his program listings). 


The observer's view is consistent if he can explain the values in his information tree with a serial 
execution that conforms to the known behaviors of all actions. We stress again that the observer does not 
know what actions have run, but he can construct hypothetical execution histories based on his program 
listings. This condition is existential: an information tree is consistent if there exists a serializable “view 
tree” which contains the information tree and agrees with known behaviors of actions. 


The problem with a condition defined in terms of program behaviors is that the transaction 
system does not have the program listings available to it (in a useful form). We imagine now that a 
“transaction manager” is placed at an action, and given its information tree. The transaction manager 
must decide whether the information tree is “consistent.” The transaction manager will design algorithms 
to insure that an observer does not see an inconsistent state, but the manager does not have access to the 
program listings. But the transaction manager can devise a sufficient test for consistency: Since every 
action must run according to its program, the actual behavior of any action in the current execution must 
be among the allowed behaviors. Thus the transaction manager will try to create a “view tree” by taking 
actions from the real global action tree. (Of course, the observer cannot see this global tree.) Another 
way of looking at this restriction is to imagine that the program listings given to the observer are modified 
so that the only possible behavior of an action is the behavior it exhibited in the current execution. 


The known behaviors of an action might include aborted actions as well as committed actions. 
For example, action B might run child B2 sequentially after child B1 in every execution. If B2 runs it can 
conclude that B1 has either committed or aborted. Moreover, if B commits, any other action can 
conclude that B1 committed or aborted, and that B2 committed or aborted. Note that if B aborted, then 
another action cannot conclude anything about B1 or B2 (since they might never have run at all). 


Strictly speaking, the transaction manager should include these known aborts in its view trees, 
because they are part of “behavior.” Just as we argued that there is no need to include these aborts in 
information trees, we can argue that there is no need to include them in view trees: Since aborted actions 
provide no information about their proper descendants, these proper descendants need not be included in 
the view tree. But aborted actions without descendants cannot affect serializability (by Lemma 2.6.2), so 
it is sufficient for the transaction manager to test for a serializable view tree which does not include these 
known aborts. It suffices for the transaction manager to choose actions for the view tree which are visible 
to A. (In other words, if a serializable view tree exists which includes these aborted actions, then it will 
still be serializable when the aborted actions are deleted. Thus we lose no generality by considering only 
view trees which do not contain these aborted actions.) 


We place two restrictions on the selection of actions for this hypothetical view tree. First, the 
transaction manager must choose actions that are visible to the action whose tree he is constructing. 
Second, because the behavior of an action might depend on any information available to it, if the 
transaction manager includes any action in his view tree, he must include the entire information tree of 
that action. 


Example: We consider again the scenario presented in Fig. 1.1. Suppose that except for 
action A, the top level actions in the system each create two (sequentially related) subactions; 
the first subaction reads and increments x, and the second subaction reads and increments y. 
(Action A simply reads x and then reads y.) The initial values of x and y are 0. The 
information tree for A2 from the tree in Fig. 1.1 indicates that x = 0, y = 1. If the transaction 
manager were allowed to create a view tree which included only part of action B (i.e. included 
descendant B2 but excluded B1), he would conclude (wrongly) that A2’s view is consistent. 
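
The example can be replayed by brute force. In the sketch below, which is only an illustration, each top-level incrementer is modeled as two steps (increment x, then increment y), and `views_seen_by_A` lists what A could read if some number of whole incrementers ran serially before it. The step names, the function names, and the bound `max_actions` are assumptions of the sketch.

```python
def run(steps):
    """Apply steps ('inc_x' or 'inc_y') to the initial state x = y = 0."""
    state = {"x": 0, "y": 0}
    for step in steps:
        variable = step[-1]        # 'inc_x' -> 'x', 'inc_y' -> 'y'
        state[variable] += 1
    return state["x"], state["y"]


def views_seen_by_A(max_actions):
    """(x, y) pairs A can read when only complete incrementers precede it serially."""
    whole_action = ["inc_x", "inc_y"]
    return {run(whole_action * k) for k in range(max_actions + 1)}


observed = (0, 1)                        # the values in A2's information tree
print(observed in views_seen_by_A(5))    # False: no serial explanation exists
print(run(["inc_y"]) == observed)        # True: keeping B2 but dropping B1 "explains" it
```

Every serial explanation built from whole actions yields x = y, so the observed pair can only be "explained" by splitting an action, which is exactly what the second restriction rules out.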


Note also that the status of proper ancestors of the action should be ‘active’, since the observer 
should be able to believe that its proper ancestors are active (though in fact they might have committed or 
aborted). We include these proper ancestors in the view tree, but we exclude them from the information 
set closure requirement because (as discussed in the section on information trees) we have short-circuited 
these ancestors with our definition of information flow. (Thus we require the vertices of the view tree to 
be info-set_T//A-closed, rather than info-set_T-closed.) 


For convenience, we separate the serializability requirement from the other requirements, and 
we define a view tree as any tree which satisfies the proper closure properties. 


Defn 3.3.1: Let T be an action tree, A ∈ vertices_T. Let S = (T|V)//A, for some set V ⊆ 
vertices_T. We say S is a view tree for A in T iff 

1. A ∈ V 

2. V is anc-closed 

3. V is info-set_T//A-closed 

4. V ⊆ visible_T(A) 


(Note that A ∈ V and V is info-set_T//A-closed ⇒ info-tree_T(A) is a restriction of S.) 


It is important to stress again that there is not enough information in action trees alone to 
determine the view tree for an action: a view tree is one of possibly several explanations for an 
information tree. As a trivial example, suppose that actions A1, A2, and B read object x in this order, but 
never update it. (See Fig. 3.2.) Then any combination of actions that includes B forms a serializable view 
tree for B. 
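
The closure conditions in the definition of a view tree can be tested by enumeration on a small tree. The sketch below checks only those four conditions, not serializability. It assumes a tree shaped like the trivial example (a root U with three committed, concurrent children A1, A2 and B), which may differ from the tree drawn in Fig. 3.2. It also assumes that an action is visible to A when all of its ancestors that are not ancestors of A have committed. Both assumptions and all Python names are specific to this sketch.

```python
from itertools import combinations

# Hypothetical tree: root U with three committed, concurrent children.
PARENT = {"A1": "U", "A2": "U", "B": "U"}
STATUS = {"U": "active", "A1": "committed", "A2": "committed", "B": "committed"}
SEQ = set()                      # no sibling sequentially precedes another
VERTICES = set(STATUS)


def ancestors(a):
    """Ancestors of a, including a itself."""
    out = set()
    while a is not None:
        out.add(a)
        a = PARENT.get(a)
    return out


def info_set(a):
    """Closure of {a} under committed children and committed actions that
    sequentially precede an ancestor (assumed readings of v-child, v-anc-seq)."""
    seen, frontier = {a}, [a]
    while frontier:
        b = frontier.pop()
        step = {c for c, p in PARENT.items() if p == b and STATUS[c] == "committed"}
        step |= {x for (x, y) in SEQ if y in ancestors(b) and STATUS[x] == "committed"}
        for c in step - seen:
            seen.add(c)
            frontier.append(c)
    return seen


def visible(a):
    """Assumed: b is visible to a iff every ancestor of b that is not an
    ancestor of a has committed."""
    return {b for b in VERTICES
            if all(STATUS[c] == "committed" for c in ancestors(b) - ancestors(a))}


def is_view_set(v, a):
    """Check the four conditions of the view tree definition for vertex set v."""
    proper_ancestors = ancestors(a) - {a}
    return (a in v                                                  # condition 1
            and all(ancestors(b) <= v for b in v)                   # condition 2
            and all(info_set(b) <= v for b in v - proper_ancestors) # condition 3
            and v <= visible(a))                                    # condition 4


candidates = [set(c) for r in range(len(VERTICES) + 1)
              for c in combinations(sorted(VERTICES), r)]
view_sets = [v for v in candidates if is_view_set(v, "B")]
print(len(view_sets))            # 4
```

Under these assumptions the four vertex sets are {U, B} with any subset of {A1, A2} added, which agrees with the remark that any combination including B is acceptable.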


For AATs, we will define a particular view tree, using the data ordering. To conclude that this is 
necessarily the view tree is incorrect: use of this particular view tree requires assumptions about how 
versions of objects are modified. We will use this view tree for one of our system models, but again note 
that the definition of a view tree is independent of the construction of this particular view tree. 


Fig. 3.2. Multiple View Trees 

[Figure not reproduced: a global action tree whose datasteps are each labeled x,0, and one view tree for B.] 


3.4 View-Serializability 


The view of an action is “consistent” if it can be explained with a serializable view tree. An 
entire tree is “view-serializable” if every action has a serializable view tree. 


Defn 3.4.1: Let T be an action tree. We say T is view-serializable provided the following is 
true: For each A ∈ vertices_T there exists a serializable view tree for A in T. 


View-serializability is our basic correctness condition modeling internal consistency. In fact 
view-serializability is a strong enough condition to model external consistency as well: The following 
lemma shows that perm(T) is the only possible view tree of the (virtual) top-action, U. 


Lemma 3.4.2: Let T be an action tree. Then S is a view tree for U in T if and only if S = 
perm(T). 


Proof: Suppose that S = (T|V)//U is a view tree for U in T; we show that S = perm(T). But 
prop-anc(U) = ∅ ⇒ S = T|V. 

Since V is info-set_T//U-closed, V is info-set_T-closed (again, because prop-anc(U) = ∅). 
Thus V is v-child_T-closed, ⇒ v-desc_T(U) ⊆ V (since U ∈ V). 

But v-desc_T(U) = visible_T(U) ⇒ visible_T(U) ⊆ V. 

But V ⊆ visible_T(U), since S is a view tree for U. 

Thus V = visible_T(U), ⇒ S = T|visible_T(U) = perm(T). 


Conversely, let S = perm(T); we show S is a view tree for U in T. As above, S = 
(T|visible_T(U))//U. Let W = visible_T(U). We show that W satisfies the correct closure 
properties for view trees. 


1. U ∈ W, since U ∈ visible_T(U). 


2. If A ∈ W, then anc(A) - {U} ⊆ committed_T. If B ∈ anc(A) - {U}, then anc(B) - 
{U} ⊆ committed_T ⇒ B ∈ W. If B = U, then B ∈ W by (1) above. Thus W is 
anc-closed. 


3. We show that W is info-set_T//U-closed, i.e. if A ∈ W - {U}, and B ∈ 
info-set_T(A), then B ∈ W. But A ∈ W - {U} ⇒ A ∈ visible_T(U), by definition. B 
∈ info-set_T(A) ⇒ B ∈ visible_T(A), by Lemma 3.2.5.5. Thus B ∈ visible_T(U) by 
Lemma 2.2.3.1c, ⇒ B ∈ W. 


4. W ⊆ visible_T(U) by definition. ∎ 


Thus view-serializability implies serializability of perm(T); our condition for external 
consistency is covered by our condition for internal consistency. 


Lemma 3.4.3: Let T be an action tree. Then 

T is view-serializable ⇒ perm(T) is serializable. 


Proof: Immediate from Lemma 3.4.2. ∎ 


3.5 Augmented Action Trees and Data-closed View Trees 


We extend all definitions and lemmas for information sets, information trees, view trees, and 
view-serializability to AAT’s in the obvious way (by applying them to erase(T)). (There is a subtle point 
that the definition of restriction of an AAT is different from the definition for an action tree, since a 
restriction of an AAT includes the data ordering from the original AAT. But the data ordering does not 
enter into any of the preceding definitions or lemmas, and erase(T)|V = erase(T|V) for all AAT’s, T, and 
action sets, V.) 


For AAT’s we define a particular view tree by augmenting the information tree via a type of 
data-closure. For the models that we will consider (in which only explicit aborts are allowed, and versions 
of objects change only in response to explicit commits and aborts), this view tree will be used to show 
view-serializability. 

Defn 3.5.1: Let T be an AAT, A ∈ vertices_T. Define S = vtree_T(A) as follows: 


Let V = vset_T(A) (= v-precedes_T*(A)). 


The components of S are as follows: 


- vertices_S = V ∪ prop-anc(A) 
- status_S is defined by 

1. B ∈ V - {A} ⇒ status_S(B) = ‘committed’ 
2. status_S(A) = status_T(A) 
3. A’ ∈ prop-anc(A) - V ⇒ status_S(A’) = ‘active’ 

- If B ∈ datasteps_S, then label_S(B) = label_T(B). 
- data_S = data_T ∩ (vertices_S)² 


Unlike the situation for information sets, the view set of an action might include proper 
ancestors of that action. (This case occurs only when the view set “cycles back” to ancestors of the action; 
proper ancestors are not originally included in the view set.) The following lemma shows that vtree_T(A) 
is a view tree for A if these cycles do not occur: 


Lemma 3.5.2: Let T be an AAT, A ∈ vertices_T, S = vtree_T(A). If prop-anc(A) ∩ vset_T(A) = 
∅, then S is a view tree for A in T. 

Proof: Let V = vset_T(A), W = V ∪ prop-anc(A). First we show that S = (T|W)//A. 


By definition, vertices_S = V ∪ prop-anc(A) = W. If B ∈ V - {A}, then status_S(B) = 
‘committed’. But by Lemma 2.3.2, status_T(B) = ‘committed’. For B ∈ prop-anc(A) - V, 
status_S(B) = ‘active’, by definition. But prop-anc(A) ∩ V = ∅ ⇒ prop-anc(A) - V = 
prop-anc(A). Thus B ∈ prop-anc(A) ⇒ status_S(B) = ‘active’. For action A, status_S(A) = 
status_T(A), by definition. Thus the trees S and (T|W)//A agree on the status of all actions. 


It is trivial to verify that these trees agree on all labels, and on the data ordering. 


Now we show that W satisfies the correct closure properties for view trees: 

1. A ∈ W by definition. 


2. W is anc-closed by Lemma 2.3.4. 


3. We show that W is info-set_T//A-closed, i.e. that (info-set_T//A)(W) ⊆ W. But 
by definition, info-set_T//A is ∅ on prop-anc(A), and is identical to info-set_T 
otherwise. Thus we must show info-set_T(V) ⊆ W. 


But V = vset_T(A) ⇒ V is vset_T-closed by Lemma 2.3.3a. 


vset_T = v-precedes_T* = (v-anc-seq_T ∪ v-child_T ∪ v-data-anc_T)*. But 
info-set_T = (v-anc-seq_T ∪ v-child_T)*. 


Thus info-set_T(V) ⊆ vset_T(V). 
Thus V is vset_T-closed ⇒ vset_T(V) ⊆ V, ⇒ info-set_T(V) ⊆ V, ⇒ 
info-set_T(V) ⊆ W. 


4. V ⊆ visible_T(A), prop-anc(A) ⊆ visible_T(A), 
⇒ W ⊆ visible_T(A). ∎ 


We will show in Chapter 6 that these cycles can only occur for view sets of orphans, and that the 
orphan detection strategy which we present will eliminate these cycles. 


As examples of the construction of these data-closed view trees, vtree_T(A2) for the tree of Fig. 
1.1 is the entire tree, and it is not serializable. For the tree of Fig. 3.2, vtree_T(B) is also the entire tree, but 
it is serializable. 


4. Event-State Algebras 


This chapter defines our basic execution model: the event-state algebra. An event-state algebra is 
a state-transition model of a system where events can occur asynchronously. A correctness proof for an 
event-state algebra shows that the states generated by valid event sequences satisfy some property. A strategy 
of hierarchical correctness proofs is explained: We define mappings between event-state algebras, and we 
give conditions on these mappings which insure that they preserve validity of event sequences. Finally, we 
present a model for distributed systems which is a special case of event-state algebras. 


4.1 Event Algebras 
4.1.1 Notation 


If S is a (finite or infinite) set of symbols, then S* denotes the set of finite sequences of symbols from S, 
including the empty sequence, Λ. We will often drop the distinction between a symbol and a sequence 
of length one. 

N denotes the set of non-negative integers, and |u| ∈ N denotes the length of sequence u. 


If sequence u is a prefix of sequence v, then we write u ≤ v. (Context will dictate whether “≤” refers to 
the prefix relation on sequences or to numerical order on integers.) We say a set of sequences, W, is 
prefix-closed if and only if all prefixes of every sequence in W are also in W: (∀v ∈ W)(u ≤ v ⇒ u ∈ 
W). 
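
For finite sets of finite sequences, the prefix relation and prefix-closure are easy to compute. The sketch below represents sequences as Python tuples and treats every sequence as a prefix of itself; both choices, and all function names, are assumptions made for illustration.

```python
def is_prefix(u, v):
    """True iff sequence u is a prefix of sequence v."""
    return v[:len(u)] == u


def prefixes(v):
    """All prefixes of v, from the empty sequence up to v itself."""
    return [v[:i] for i in range(len(v) + 1)]


def is_prefix_closed(w):
    """True iff every prefix of every sequence in w is also in w."""
    return all(u in w for v in w for u in prefixes(v))


def prefix_closure(w):
    """Smallest prefix-closed set containing w."""
    return {u for v in w for u in prefixes(v)}


W = {(), ("a",), ("a", "b")}
print(is_prefix_closed(W))                 # True
print(is_prefix_closed({("a", "b")}))      # False: () and ("a",) are missing
print(prefix_closure({("a", "b")}) == W)   # True
```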


If u ∈ S* is a sequence, and e ∈ S, then we write e ∈ u iff e is among the elements of u. (Note that, a 
priori, e might be repeated in u many times.) We denote by →_u the ordering on elements of u, i.e. if e,f 
∈ S, then 


e →_u f ⇔ u = a·e·b·f·c, for some sequences a,b,c ∈ S*. 


Note that →_u is transitive for any u ∈ S*. It is not necessarily acyclic, since elements of a sequence can be 
repeated. 


4.1.2 Events and Valid Execution Sequences 


An event algebra is a behavioral model of a system which describes the events in the system, and 
some constraints on “valid” executions imposed by the system. An execution of a system is any sequence 
of events from the system; the valid execution sequences will be some subset of these sequences. This 
type of model is useful for describing systems where events occur asynchronously and independently (as 
opposed to a program model, for example, where the allowable sequences of events are governed by a 
(generally sequential) program). It is also useful for describing properties of sequential systems which do 
not depend on the order of events (or depend on weaker ordering constraints than those enforced by the 
system). 


At this level of description, “events” are completely uninterpreted: they should be regarded as 
textual symbols only. The only structure imposed by an event algebra is the set of valid execution 
sequences. 


Defn 4.1.2.1: An event algebra is a pair 
A = (ℰ, 𝒱) 


where ℰ is a set (called the events of A), and 
𝒱 is a prefix-closed subset of ℰ* (called the valid execution sequences of A). 


(We will generally use symbols “e,f,g” to refer to individual events, and “u,v,w” to refer to sequences of 
events.) 


We can consider the general problem faced in reasoning about a system to be showing some 
properties of the valid execution sequences. We are not interested (at this level) in how the system 
enforces the constraints on execution sequences. The valid execution sequences are simply a specification 
of the “correct” behaviors of the system. 


4.1.3 Interpretations 


We would often like to view a system at a higher level of abstraction than the one at which it is 
defined. In this section we describe an abstraction process for event algebras, and we show how this 
process can be used to organize proofs of system properties. 


Defn 4.1.3.1: Let A_1 = (ℰ_1, 𝒱_1), and A_2 = (ℰ_2, 𝒱_2) be event algebras. An interpretation 
from A_1 to A_2 is a mapping h: ℰ_1* → ℰ_2*. 


An interpretation, h, is valid iff h(𝒱_1) ⊆ 𝒱_2. 
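
When the valid execution sequences are given as finite sets, validity of an interpretation can be checked directly. The following sketch is a toy illustration: the events `req`, `log`, `ack` and `done`, the two sets of valid sequences, and the function names are all hypothetical.

```python
def is_valid(h, valid_1, valid_2):
    """True iff h maps every valid sequence of the first algebra to a valid
    sequence of the second."""
    return all(h(u) in valid_2 for u in valid_1)


# Low-level algebra: a request, an optional log entry, then an acknowledgement.
VALID_1 = {(), ("req",), ("req", "log"), ("req", "ack"), ("req", "log", "ack")}
# High-level algebra: at most one 'done' event.
VALID_2 = {(), ("done",)}


def hide_details(u):
    """Interpret each 'ack' as a high-level 'done' and hide every other event."""
    return tuple("done" for e in u if e == "ack")


def count_everything(u):
    """Interpret every low-level event as a 'done'."""
    return tuple("done" for _ in u)


print(is_valid(hide_details, VALID_1, VALID_2))      # True
print(is_valid(count_everything, VALID_1, VALID_2))  # False
```

The second mapping fails because it sends ("req", "log") to ("done", "done"), which is not a valid high-level sequence.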


Note that any event sequence in one algebra can be interpreted as any sequence in another 
algebra: there are no constraints on this mapping. Although most interpretations of interest will have 
more structure (for example, h might be monotonic), it is not necessary to introduce this structure for 
these general definitions. 


In proving a property of valid execution sequences for some event algebra, it might be useful to 
state this property as a constraint on execution sequences of an event algebra which is at a higher level of 
abstraction than the low-level model of the system of interest. (We might be interested only in particular 
events, for example, or we might regard a sequence of events as a single event at a higher level.) We 
might also want to break this abstraction process into several steps, constructing event algebras at 
intermediate levels of abstraction. We must then define valid interpretations between successive levels. 
Soundness of this technique follows directly from the following lemma: 


Lemma 4.1.3.2: Let A_1, A_2, A_3 be event algebras. If g is a valid interpretation from A_1 to 
A_2, and h is a valid interpretation from A_2 to A_3, then h∘g is a valid interpretation from A_1 to 
A_3. 


Proof: Straightforward. ∎ 
Of course, we must be careful in applying this technique to be sure that the composition of 


mappings from lower-level algebras to higher-level algcbras is consistent with the abstraction we desire 


from the lowest-level event sequences to the “abstract” event sequences. 


We can reduce any problem of proving a property of valid execution sequences to an equivalent
problem of constructing a valid interpretation: Suppose $A_1 = (\mathcal{E}_1, \mathcal{V}_1)$ is an event algebra, and $P \subseteq \mathcal{E}_1^*$ is
some property of execution sequences. We want to show that P holds for all valid execution sequences in
$A_1$, i.e. that $\mathcal{V}_1 \subseteq P$. We can construct a "higher-level" algebra, $A_2$, whose valid execution sequences are
just those specified by P: $A_2 = (\mathcal{E}_1, P)$. If we define interpretation h from $A_1$ to $A_2$ as the identity map
on event sequences, then $\mathcal{V}_1 \subseteq P$ if and only if h is valid.


By defining a top-level event algebra whose valid execution sequences automatically satisfy a
desired property, we create a very uniform structure for our proofs: A "correctness" proof consists of
definitions for a sequence of algebras, definitions for interpretations between levels, and proofs that all
interpretations are valid.


4.1.4 Event-Homomorphic Interpretations 


We defined interpretations very generally as any mapping between event sequences. Usually
natural interpretations will have more structure, which will simplify a proof of validity. We define here a
class of interpretations called "event-homomorphic" which allow the interpretation of any sequence to be
constructed inductively from an interpretation of each event in the sequence.


Defn 4.1.4.1: Let $A_1 = (\mathcal{E}_1, \mathcal{V}_1)$ and $A_2 = (\mathcal{E}_2, \mathcal{V}_2)$ be event algebras, and $h: \mathcal{E}_1^* \to \mathcal{E}_2^*$ be
an interpretation from $A_1$ to $A_2$. We say h is an event-homomorphic interpretation iff

$\forall u,v \in \mathcal{E}_1^*,\ h(uv) = h(u)h(v)$

(Note that if h is event-homomorphic, then $h(\Lambda) = h(\Lambda\Lambda) = h(\Lambda)h(\Lambda)$; thus $h(\Lambda) = \Lambda$.)

If an interpretation, h, is event-homomorphic, then the image of any sequence can be
constructed from the images of each element in the sequence. Thus we can specify an
event-homomorphic interpretation as a mapping $h: \mathcal{E}_1 \to \mathcal{E}_2^*$.
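As an informal illustration (not part of the formal development), the following Python sketch extends an event map to sequences by concatenation and checks validity on all sequences up to a bounded length. The event names, the two validity predicates, and the helper names `extend` and `is_valid_up_to` are hypothetical; sequences are modeled as tuples and the empty tuple plays the role of $\Lambda$.

```python
from itertools import product

def extend(h_event):
    """Extend an event map (event -> tuple of events) to sequences, so h(uv) = h(u)h(v)."""
    def h(seq):
        out = []
        for e in seq:
            out.extend(h_event[e])
        return tuple(out)
    return h

def is_valid_up_to(h, valid1, valid2, events1, max_len):
    """Bounded check that h maps valid lower-level sequences to valid higher-level ones."""
    for n in range(max_len + 1):
        for seq in product(events1, repeat=n):
            if valid1(seq) and not valid2(h(seq)):
                return False
    return True

# Hypothetical lower level: a "write" is implemented by lock, store, unlock.
# The earlier steps map to the empty sequence; the last step maps to the abstract event.
h_event = {"lock": (), "store": (), "unlock": ("write",), "ping": ()}
h = extend(h_event)

def valid1(seq):
    """Lower level: lock, store, unlock must cycle in order; ping is always allowed."""
    order = ("lock", "store", "unlock")
    phase = 0
    for e in seq:
        if e == "ping":
            continue
        if e != order[phase]:
            return False
        phase = (phase + 1) % 3
    return True

def valid2(seq):
    """Higher level: any sequence of writes is valid."""
    return all(e == "write" for e in seq)
```

Here "ping" is abstracted out entirely, and only the final step of each concrete lock/store/unlock cycle is visible at the higher level. The bounded check is evidence, not a proof, of validity.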


Note that for an event-homomorphic interpretation, individual events in the lower-level algebra
can be interpreted as any sequence of events in the higher-level algebra. A lower-level event which maps
to $\Lambda$ is effectively "abstracted out" at the higher level. A lower-level event which maps to a single event is
visible at the higher level, although different lower-level events might map to the same higher-level event.
To model the usual notion of "abstraction," where several "concrete" events might implement a single
"abstract" event, we could map the earlier steps of the concrete sequence to $\Lambda$, and map the last step to
the abstract event.


Our notion of "abstraction" is unusual, however, in that the image of a single lower-level event
might be several higher-level events. We allow this case because the "observer" of a system might not be
able to see the granularity of events directly: he might only see their effects (e.g. through changes in
"state" caused by events). An "abstraction" in this sense might be a higher-level way of explaining these
effects. It is possible that higher-level events can be "simpler" to understand, even though they are less
"powerful" in that several higher-level events are needed to explain a single lower-level event.


We will deal only with event-homomorphic interpretations; in the remainder of this paper,
"interpretation" always means "event-homomorphic interpretation."


4.2 Event-State Algebras 
4.2.1 Events as State Transitions 


Although our notion of the behavior of a system depends only upon the events in the system
and the valid execution sequences, it is often convenient to describe a system by referring to a "system
state." Specifically, we can abstract from event sequences to "states" by interpreting events as operations
on a state. We introduce a structure called an "event-state algebra," which includes state as a basic
system component.


Following [Stark83], we regard the events in a system as the fundamental entities; we introduce
states for convenience in specifying the valid event sequences. The concept of "state" allows us to
describe valid event sequences inductively by giving "preconditions" on the current state for each event.
Because it is often simpler to reason incrementally about system behavior, states are a useful specification
device. From this perspective, a system could be described (equally well) by several event-state algebras
using different state spaces; these different state spaces would simply represent different ways of
summarizing execution histories.

Defn 4.2.1.1: An event-state algebra is a quadruple
$A = (\mathcal{E}, \Sigma, \sigma, \tau)$

where $\mathcal{E}$ is a set of events,
$\Sigma$ is the set of system states,
$\sigma \in \Sigma$ is the initial system state, and
$\tau \subseteq \mathcal{E} \times \Sigma \times \Sigma$ is the transition relation.

Let $\tau(e) = \{(s,t) \in \Sigma^2 : (e,s,t) \in \tau\}$.
For convenience, we require that $\tau(e)$ be a partial function on $\Sigma$, i.e.
$(e,s,t_1), (e,s,t_2) \in \tau \Rightarrow t_1 = t_2$.
(We could allow $\tau(e)$ to be an arbitrary relation, modeling a nondeterministic choice of the
"next state." Because we will not need this power, we restrict $\tau(e)$ to a partial function.)

We regard $\tau(e)$ as a total function on $\Sigma \cup \{\bot\}$ (where $\bot$ represents "undefined") by
defining
$\tau(e)(\bot) = \bot$, and
$\tau(e)(s) = \bot$ for $s \in \Sigma$ if there is no pair $(s,t) \in \tau(e)$.

If $s \in \Sigma \cup \{\bot\}$ and $e \in \mathcal{E}$, then we write
$se$ for $\tau(e)(s)$.

We generally drop the distinction between the event e and the partial function $\tau(e)$ when the
meaning is clear. We extend our notation to sequences of events in the obvious way:
$(s)(e_1 e_2 \ldots e_n) = (\ldots(((s)e_1)e_2)\ldots e_n)$

If $u \in \mathcal{E}^*$ then we say $\sigma u$ is the result of execution sequence u. (Note that the result might
be $\bot$.)

If $\exists u \in \mathcal{E}^* : s_2 = (s_1)u$ (for $s_1, s_2 \in \Sigma$) then we write $s_1 \vdash s_2$ in A, and we say $s_2$ is
reachable from $s_1$ in A. We will simply write $s_1 \vdash s_2$ when the algebra is clear from
context.

If $H \subseteq \mathcal{E}^*$ and $s \in \Sigma$, then we define
$sH = \{su : u \in H\}$

(Similarly if $S \subseteq \Sigma$ and $u \in \mathcal{E}^*$ we define $Su$, or if $S \subseteq \Sigma$ and $H \subseteq \mathcal{E}^*$, we define $SH$.)

$\mathcal{V}(A)$, the set of valid execution sequences of A, consists of all sequences whose result is
defined (i.e. each event in the sequence is defined on the result of the preceding sequence):
$u \in \mathcal{V}(A) \iff \sigma u \neq \bot$

$\mathcal{R}(A)$, the set of reachable states in A, is the set of all states that are reachable from the initial
state:

$\mathcal{R}(A) = \{s \in \Sigma : \sigma \vdash s\}$ (Thus $\mathcal{R}(A) = \sigma\mathcal{V}(A)$.)

We extend this definition to sequences of reachable states as follows:
$\mathcal{R}^{(n)}(A) = \{\langle s_1, s_2, \ldots, s_n \rangle \in \Sigma^n : \sigma \vdash s_1 \vdash s_2 \vdash \ldots \vdash s_n\}$

Note that $\mathcal{R}^{(n)}(A) \subseteq (\mathcal{R}(A))^n$.

We will use boldface symbols to refer to vectors of states, e.g. $\mathbf{s} = \langle s_1, s_2, \ldots, s_n \rangle$.

We denote by $PRE_A(e)$ the proper domain of $\tau(e)$, for each $e \in \mathcal{E}$. ($PRE_A(e) = \{s \in \Sigma : \tau(e)(s) \neq \bot\}$.)
We generally drop the subscript when the algebra is clear from context. We
extend this notation to sequences $u \in \mathcal{E}^*$ by defining:
$PRE(u) = \{s \in \Sigma : su \neq \bot\}$

(In general, if an event-state algebra is named "$A_n$" for some subscript "n", then we will
abbreviate $\mathcal{V}(A_n)$ as "$\mathcal{V}_n$", $\mathcal{R}(A_n)$ as "$\mathcal{R}_n$", and $PRE_{A_n}$ as "$PRE_n$.")
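The definitions above can be made concrete with a small Python sketch. This is only an illustration under stated assumptions: the class name `EventStateAlgebra`, the bounded-counter example, and the use of `None` for $\bot$ are all hypothetical choices, and the valid sequences and reachable states are enumerated only up to a bounded length.

```python
from itertools import product

BOTTOM = None  # stands for the undefined result; real states must not be None

class EventStateAlgebra:
    """Sketch of (events, states, initial state, transition) with tau(e) a partial function."""

    def __init__(self, events, initial, step):
        self.events = list(events)
        self.initial = initial
        self.step = step  # step(e, s) returns the next state, or BOTTOM if e is not enabled

    def apply(self, s, seq):
        """Compute (s)u for a sequence u, propagating BOTTOM."""
        for e in seq:
            if s is BOTTOM:
                return BOTTOM
            s = self.step(e, s)
        return s

    def is_valid(self, seq):
        """u is a valid execution sequence iff its result from the initial state is defined."""
        return self.apply(self.initial, seq) is not BOTTOM

    def valid_sequences(self, max_len):
        """All valid execution sequences of length at most max_len."""
        for n in range(max_len + 1):
            for seq in product(self.events, repeat=n):
                if self.is_valid(seq):
                    yield seq

    def reachable(self, max_len):
        """States reachable by valid sequences of length at most max_len."""
        return {self.apply(self.initial, u) for u in self.valid_sequences(max_len)}

    def pre(self, e, states):
        """The members of `states` on which event e is defined."""
        return {s for s in states if self.step(e, s) is not BOTTOM}

def counter_step(e, s):
    """Hypothetical example: a counter confined to 0..2."""
    if e == "inc" and s < 2:
        return s + 1
    if e == "dec" and s > 0:
        return s - 1
    return BOTTOM

counter = EventStateAlgebra(["inc", "dec"], 0, counter_step)
```

Because validity is defined through the result of a sequence, the set produced by `valid_sequences` is prefix closed by construction, which matches the remark on presentations that follows.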


We are viewing event-state algebras as convenient structures for specifying event algebras. We
say that an event-state algebra $A' = (\mathcal{E}', \Sigma', \sigma', \tau')$ is a presentation of event algebra $A = (\mathcal{E}, \mathcal{V})$ if and
only if $\mathcal{E}' = \mathcal{E}$ and $\mathcal{V}(A') = \mathcal{V}$. (Note that $\mathcal{V}(A')$ must be prefix closed by construction.) It follows from
this definition that several event-state algebra presentations might exist for a given event algebra, but each
event-state algebra is a presentation of a unique event algebra. If $A'$ is a presentation of A, then we say
that A is the embedded event algebra for $A'$.


We can show that an event-state algebra presentation exists for any event-algebra -- the
degenerate presentation whose state is the entire execution history:

Lemma 4.2.1.2: For any event algebra $A = (\mathcal{E}, \mathcal{V})$, there exists an event-state algebra
presentation of A.

Proof: Let $A' = (\mathcal{E}', \Sigma', \sigma', \tau')$, where
$\mathcal{E}' = \mathcal{E}$, $\Sigma' = \mathcal{E}^*$, $\sigma' = \Lambda$, and $\tau' = \{(e,u,ue) : e \in \mathcal{E}, ue \in \mathcal{V}\}$. Then $\mathcal{V}(A') = \mathcal{V}$, so $A'$ is a
presentation of A. ∎

Thus we will deal only with event-state algebras from here, with no loss in generality.


An interpretation from one event-state algebra to another is defined to be any interpretation
between the embedded event algebras. This interpretation is valid if and only if the interpretation
between embedded event algebras is valid.


4.2.2 Possibilities Maps 


Because we are using states to describe the valid execution sequences of an event-state algebra, it
is natural to use these states in proving that an interpretation between event-state algebras is valid.
Capturing execution histories with states allows us to specify valid execution sequences inductively; by
extending the event mapping of an interpretation to a mapping between state sets, we will give an
inductive technique for proving that the interpretation is valid.


The state mappings we will define are somewhat unusual in that we allow a mapping from states
at the lower level to sets of states at the higher level. We call these mappings possibilities maps (if they
satisfy certain properties), because they give a set of possible higher-level states which correspond to each
lower-level state. Because the states in an event-state algebra can represent any convenient summary of
execution histories, it is possible that the higher-level state might retain more information about
executions than the lower-level state. In this case there is not enough information in the lower-level state
to uniquely determine the higher-level state. Thus we permit "looser" mappings which specify the set of
states which are consistent with (are "possibilities" for) a given lower-level state.


Possibilities maps are particularly useful when the lower-level state is distributed, and the
higher-level algebra is a global interpretation of the lower-level algebra. (It might be convenient to specify
a distributed algorithm in terms of a "virtual" global state, for example.) Because the lower-level state is
partitioned among components, each component has only partial knowledge of the total system state.
Thus there will generally be several higher-level states which are "possibilities" given the state at an
individual component. This "partial information" property of distributed systems makes possibilities
maps a natural tool for describing interpretations of these systems.


Possibilities maps can be regarded as a generalization of the standard notion of homomorphism.
The state mapping of a homomorphism is a single-valued function, because the higher-level state space is
always "more abstract" (has less detailed information) than the lower-level state space.


If $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ are event-state algebras, then we will
write $h: A_1 \to A_2$ if $h: \mathcal{E}_1 \to \mathcal{E}_2^*$ and $h: \Sigma_1 \to \mathcal{P}(\Sigma_2)$. (We use "h" to denote both the event mapping
and the state mapping; the meaning will be clear from context.) Note that $h: A_1 \to A_2$ does not
necessarily imply that h satisfies any special properties; in particular, h need not be a possibilities map.

We say that the proper domain of h (of the state mapping) is: $domain(h) = \{s \in \Sigma_1 : h(s) \neq \emptyset\}$.

We extend a mapping $h: \Sigma_1 \to \mathcal{P}(\Sigma_2)$ to a mapping $h: \Sigma_1^n \to \mathcal{P}(\Sigma_2^n)$ by defining $h(\langle s_1, s_2, \ldots, s_n \rangle)$
$= \{\langle t_1, t_2, \ldots, t_n \rangle : t_i \in h(s_i), \text{ for } i = 1,2,\ldots,n\}$.


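Defn 4.2.2.1: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state algebras,
and let $h: A_1 \to A_2$. We say h is a possibilities map iff

1. h preserves initial states:
$\sigma_2 \in h(\sigma_1)$

2. h preserves events:
$s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap \mathcal{R}_2$
$\Rightarrow (t)h(e) \in h(se)$

(Note that $(t)h(e) \in h(se) \Rightarrow (t)h(e) \neq \bot$, since $h(se) \subseteq \Sigma_2$.)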
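The two conditions of the definition can be checked mechanically when both algebras have finitely many reachable states. The following Python sketch is an illustration only; the function names and the parity/counter example are hypothetical, and `None` stands for $\bot$. The lower level remembers only the parity of the number of ticks, while the higher level counts ticks modulo 4, so the lower-level state does not determine the higher-level state.

```python
BOTTOM = None  # stands for the undefined result; real states must not be None

def run(step, s, seq):
    """Apply a sequence of events to state s, propagating BOTTOM."""
    for e in seq:
        if s is BOTTOM:
            return BOTTOM
        s = step(e, s)
    return s

def reachable(events, initial, step):
    """All reachable states; assumes the reachable set is finite."""
    seen, frontier = {initial}, [initial]
    while frontier:
        s = frontier.pop()
        for e in events:
            t = step(e, s)
            if t is not BOTTOM and t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

def is_possibilities_map(low, high, h_event, h_state):
    """low and high are triples (events, initial, step); h_state returns a set of states."""
    ev1, init1, step1 = low
    ev2, init2, step2 = high
    if init2 not in h_state(init1):          # h preserves initial states
        return False
    r1 = reachable(ev1, init1, step1)
    r2 = reachable(ev2, init2, step2)
    for s in r1:                              # h preserves events
        for e in ev1:
            se = step1(e, s)
            if se is BOTTOM:
                continue                      # s is not in PRE_1(e)
            for t in h_state(s) & r2:
                if run(step2, t, h_event[e]) not in h_state(se):
                    return False
    return True

def parity_step(e, p):
    return 1 - p

def counter_step(e, c):
    return (c + 1) % 4

low = (["tick"], 0, parity_step)
high = (["tick"], 0, counter_step)
h_event = {"tick": ("tick",)}

def same_parity(p):
    """Every counter value consistent with parity p is a possibility."""
    return {c for c in range(4) if c % 2 == p}

def too_tight(p):
    """A single-valued guess that loses possibilities."""
    return {p}
```

With these definitions `same_parity` passes the check, while `too_tight` fails it: from higher-level state 1 a tick leads to 2, which is not among the states `too_tight` associates with parity 0.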


In many cases we will not need the full power of possibilities maps to map from states to sets of
states. If a mapping $h: A_1 \to A_2$ has the property that $\forall s \in \mathcal{R}_1$, $|h(s)| \leq 1$, then we will consider h to
be a partial mapping from $\Sigma_1$ to $\Sigma_2$, and we will change our notation accordingly. (For example, we will
write $t = h(s)$ instead of $t \in h(s)$.)


We will use the properties of possibilities maps to prove inductively that a mapping is valid. As
an intermediate step, we define the notion of a faithful mapping. We then show the main result for
possibilities maps: any possibilities map is a valid interpretation.


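Defn 4.2.2.2: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$. For $k \in \mathbb{N}$, we say that h is k-faithful iff $(\forall v \in \mathcal{V}_1 : |v| \leq k)$,
$\sigma_2 h(v) \in h(\sigma_1 v)$. We say h is faithful iff h is k-faithful for all $k \in \mathbb{N}$. Note that h preserves
initial states if and only if h is 0-faithful.

Lemma 4.2.2.3: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$. Then h is faithful $\Rightarrow$ h is a valid interpretation.

Proof: h is faithful $\Rightarrow \sigma_2 h(v) \in h(\sigma_1 v)\ \forall v \in \mathcal{V}_1$. Thus $\sigma_2 h(v) \neq \bot \Rightarrow h(v) \in \mathcal{V}_2$. ∎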


Lemma 4.2.2.4: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$. Then h is a possibilities map $\Rightarrow$ h is faithful.

Proof: Suppose h is a possibilities map. Then h preserves initial states $\Rightarrow$ h is 0-faithful. We
show h is k-faithful $\Rightarrow$ h is k+1-faithful.

Let $ve \in \mathcal{V}_1$, $|v| = k$, $e \in \mathcal{E}_1$. Since h is k-faithful, $\sigma_2 h(v) \in h(\sigma_1 v)$. But $ve \in \mathcal{V}_1 \Rightarrow \sigma_1 v \in
PRE_1(e) \cap \mathcal{R}_1$. And $\sigma_2 h(v) \in h(\sigma_1 v) \cap \mathcal{R}_2$. Since h preserves events, $\sigma_2 h(v)h(e) \in
h(\sigma_1 ve)$, $\Rightarrow$ h is k+1-faithful. ∎

Lemma 4.2.2.5: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$. Then h is a possibilities map $\Rightarrow$ h is a valid interpretation.

Proof: Immediate corollary of Lemmas 4.2.2.4 and 4.2.2.3. ∎


We will often find it useful to prove preservation of events in two parts: We will assume that
preconditions are satisfied and show that transitions behave correctly under the interpretation; we will
show separately that preconditions are satisfied:

Lemma 4.2.2.6: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$. Then h preserves events if and only if

1. h preserves transitions:
$s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap PRE_2(h(e)) \cap \mathcal{R}_2$
$\Rightarrow (t)h(e) \in h(se)$

2. h preserves preconditions:
$s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap \mathcal{R}_2$
$\Rightarrow t \in PRE_2(h(e))$

Proof: Suppose h preserves events. Then
$s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap \mathcal{R}_2$,
$\Rightarrow (t)h(e) \in h(se)$,
$\Rightarrow (t)h(e) \neq \bot$,
$\Rightarrow t \in PRE_2(h(e))$, so h preserves preconditions.

$s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap PRE_2(h(e)) \cap \mathcal{R}_2$,
$\Rightarrow s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap \mathcal{R}_2$,
$\Rightarrow (t)h(e) \in h(se)$, so h preserves transitions.

Conversely, suppose h preserves preconditions and transitions. Then
$s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap \mathcal{R}_2$,
$\Rightarrow t \in PRE_2(h(e))$, since h preserves preconditions.

Thus $s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap PRE_2(h(e)) \cap \mathcal{R}_2$,
$\Rightarrow (t)h(e) \in h(se)$, since h preserves transitions. ∎


4.2.3 Canonical Possibilities Map 


We can show that the method of constructing a possibilities map between event-state algebras is 
a completely general technique for proving validity: Given any (event-homomorphic) valid 
interpretation, an extension of this interpretation to a possibilities map always exists: 


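Lemma 4.2.3.1: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and $h: \mathcal{E}_1 \to \mathcal{E}_2^*$ be a valid interpretation from $A_1$ to $A_2$. Then if we extend h to a
state-set mapping as follows:
$h(s) = \{\sigma_2 h(u) : \sigma_1 u = s\}$
then h is a possibilities map from $A_1$ to $A_2$.

Proof: First we show that $h(s) \subseteq \Sigma_2$ for all $s \in \Sigma_1$ (i.e. $\bot \notin h(s)$): $\sigma_1 u = s \Rightarrow u \in \mathcal{V}_1$,
$\Rightarrow h(u) \in \mathcal{V}_2$ (since h is valid) $\Rightarrow \sigma_2 h(u) \neq \bot$. Thus h does define a mapping from $\Sigma_1$ to
$\mathcal{P}(\Sigma_2)$.

Now we show that h satisfies the conditions for a possibilities map:

1. h preserves initial states:
$\sigma_2 = \sigma_2 \Lambda = \sigma_2 h(\Lambda)$ (since h is event-homomorphic);
$\sigma_2 h(\Lambda) \in \{\sigma_2 h(u) : \sigma_1 u = \sigma_1\} = h(\sigma_1)$.

2. h preserves events:
$s \in PRE_1(e) \cap \mathcal{R}_1,\ t \in h(s) \cap \mathcal{R}_2$.
Let $s = \sigma_1 v$, $v \in \mathcal{V}_1$.
So $t \in h(s) = \{\sigma_2 h(u) : \sigma_1 u = \sigma_1 v\}$,
$\Rightarrow t = \sigma_2 h(u)$ for some u: $\sigma_1 u = \sigma_1 v$.

Now $(t)h(e) = \sigma_2 h(u)h(e) = \sigma_2 h(ue)$
$\in \{\sigma_2 h(w) : \sigma_1 w = \sigma_1 ve\} = h(se)$. ∎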


Note that the set $h(s) = \{\sigma_2 h(u) : \sigma_1 u = s\}$ corresponds intuitively to the "possibilities" for
higher-level states associated with lower-level state s: The sequences $\{u : \sigma_1 u = s\}$ are the possible
histories which might have generated state s; $\sigma_2 h(u)$ is the higher-level state that would have resulted
from execution of u. Thus if we only know state s, then we can only "pin down" the possible higher-level
state to the set $\{\sigma_2 h(u) : \sigma_1 u = s\}$.
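The canonical construction can be approximated by enumerating lower-level histories up to a bounded length and collecting the higher-level states they produce. The Python sketch below is illustrative only: the function names and the parity/counter example are hypothetical, `None` stands for $\bot$, and the event mapping is assumed to be a valid interpretation. Because only bounded histories are explored, the computed sets may be subsets of the true $h(s)$.

```python
from itertools import product

BOTTOM = None  # stands for the undefined result; real states must not be None

def run(step, s, seq):
    """Apply a sequence of events to state s, propagating BOTTOM."""
    for e in seq:
        if s is BOTTOM:
            return BOTTOM
        s = step(e, s)
    return s

def canonical_map(low, high, h_event, max_len):
    """Approximate h(s) = {sigma_2 h(u) : sigma_1 u = s} using histories u with |u| <= max_len."""
    ev1, init1, step1 = low
    _, init2, step2 = high
    h_state = {}
    for n in range(max_len + 1):
        for u in product(ev1, repeat=n):
            s = run(step1, init1, u)
            if s is BOTTOM:
                continue  # u is not a valid lower-level history
            image = tuple(x for e in u for x in h_event[e])  # h(u), built event by event
            h_state.setdefault(s, set()).add(run(step2, init2, image))
    return h_state

def parity_step(e, p):
    """Lower level: remembers only the parity of the number of ticks."""
    return 1 - p

def counter_step(e, c):
    """Higher level: counts ticks modulo 4."""
    return (c + 1) % 4

low = (["tick"], 0, parity_step)
high = (["tick"], 0, counter_step)
possibilities = canonical_map(low, high, {"tick": ("tick",)}, 4)
```

In this example a lower-level parity of 0 is consistent with counter values 0 and 2, and a parity of 1 with counter values 1 and 3, which is exactly the "pinning down" described above.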




4.2.4 Invariants 


We have reduced the task of showing that an interpretation is valid to the task of proving that a
mapping (on both states and events) is a possibilities map. It will often be convenient to use properties of
reachable states (at both the higher and lower levels) in showing that a mapping is a possibilities map. We
generalize the notion of an invariant to include properties of sequences of states as well as properties of
single states. We also describe properties of individual components of the state, since we will show below
that if a component is preserved by a state mapping between algebras, then in some cases we can carry
invariants proved at the higher level for this component downward to the lower level (without re-proving
the invariants at the lower level). Our development of an event-state algebra hierarchy for a transaction
system will make extensive use of this method of carrying invariants down from higher level algebras.


4.2.4.1 Basic Definitions 


Defn 4.2.4.1.1: Let $A = (\mathcal{E}, \Sigma, \sigma, \tau)$ be an event-state algebra. If $I \subseteq \Sigma^n$ we say that I is an
n-ary property in A. If n = 1, then we will simply say "I is a property," and if n = 2, we will
say "I is a pair-property."

Defn 4.2.4.1.2: Let $A = (\mathcal{E}, \Sigma, \sigma, \tau)$ be an event-state algebra, and let $k \in \mathbb{N}$. If I is an n-ary
property in A, we say that I is k-invariant in A iff the following is true: For all sequences
$\langle v_1, v_2, \ldots, v_n \rangle \in \mathcal{V}^n$ such that $v_1 \leq v_2 \leq \ldots \leq v_n$, and $|v_n| \leq k$, we have $\langle \sigma v_1, \sigma v_2, \ldots, \sigma v_n \rangle \in I$.
We say that I is invariant in A iff $\forall k \in \mathbb{N}$, I is k-invariant in A. Thus I is invariant in A iff
$\mathcal{R}^{(n)}(A) \subseteq I$.

We will usually drop the qualification "in A" when the algebra is clear from context. Note
that the case n = 1 corresponds to the usual notion of an "invariant." When we say that "I is
an invariant," we will generally mean that I is a 1-ary property which is invariant. Similarly
we will say "J is a pair-invariant" if J is a pair-property which is invariant.
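As an informal illustration of k-invariance for n-ary properties, the Python sketch below enumerates every valid sequence of length at most k, forms every chain of n prefixes of it, and tests the property on the resulting vector of states. The function name `is_k_invariant`, the tick/noop example, and the use of `None` for $\bot$ are hypothetical; the ordering on sequences is taken to be the prefix ordering.

```python
from itertools import product, combinations_with_replacement

BOTTOM = None  # stands for the undefined result; real states must not be None

def is_k_invariant(events, initial, step, prop, n, k):
    """Check an n-ary property on all prefix chains v_1 <= ... <= v_n with |v_n| <= k."""
    for length in range(k + 1):
        for v in product(events, repeat=length):
            states = [initial]            # states[j] is the result of the prefix of length j
            for e in v:
                nxt = step(e, states[-1])
                if nxt is BOTTOM:
                    break                 # v is not a valid execution sequence
                states.append(nxt)
            else:
                # Non-decreasing index tuples select chains of prefixes of v.
                for cut in combinations_with_replacement(range(length + 1), n):
                    if not prop(tuple(states[j] for j in cut)):
                        return False
    return True

def tick_step(e, s):
    """Hypothetical example: 'inc' adds one, 'noop' leaves the state unchanged."""
    return s + 1 if e == "inc" else s

EVENTS = ["inc", "noop"]

def monotone(pair):
    """Pair-property: a later state is never smaller than an earlier one."""
    return pair[0] <= pair[1]

def at_most_three(single):
    """1-ary property that holds only for short executions."""
    return single[0] <= 3
```

In this example `monotone` holds for every bound tried, whereas `at_most_three` is 3-invariant but not 4-invariant, which shows why k-invariance is a strictly weaker statement than invariance.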


4.2.4.2 Relative Invariants and Relative Possibilities Maps


To prove that a particular mapping is a possibilities map, we will frequently prove first some
useful invariants for the higher and lower-level algebras. If we organize a proof hierarchically (with
several levels of event-state algebras), we might find that we need the same invariants at several of these
levels. While we could prove the needed invariants independently at each level, to do so might repeat a
lot of work unnecessarily. Since faithful mappings map reachable states into reachable states, it might be
easy to infer that higher-level invariants hold at the lower level if we knew that the mapping between
algebras were faithful. In some cases, however, we might want to use these invariants to show that the
mapping is a possibilities map (and hence is faithful, by Lemma 4.2.2.4). In these cases we are faced with
a mutual dependency between invariants and a possibilities mapping.


Our solution to this mutual dependency depends on the fact that both invariants and
possibilities maps are generally proved inductively. Conceptually, then, we will prove both an invariant
and the possibilities map together with the same induction. For convenience, we separate the
dependencies in our definitions: we define an invariant relative to a mapping, and a possibilities map
relative to a property. Because the key property of possibilities maps is faithfulness, we also define
faithfulness relative to a property, and we prove a lemma which is the "relative" version of Lemma
4.2.2.4. We also state a "relative" version of Lemma 4.2.2.6.


Defn 4.2.4.2.1: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$. Let $P \subseteq \Sigma_1$ be a property in $A_1$.

We say h is a possibilities map relative to P iff

1. h preserves initial states ($\sigma_2 \in h(\sigma_1)$)

2. h preserves events relative to P:
$s \in PRE_1(e) \cap \mathcal{R}_1 \cap P,\ t \in h(s) \cap \mathcal{R}_2$
$\Rightarrow (t)h(e) \in h(se)$

Defn 4.2.4.2.2: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, $P \subseteq \Sigma_1$, and let $h: A_1 \to A_2$.

We say h is faithful relative to P iff

1. h is 0-faithful

2. $(\forall k \in \mathbb{N})$ h is k-faithful, and P is k-invariant $\Rightarrow$ h is k+1-faithful


Lemma 4.2.4.2.3: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, $P \subseteq \Sigma_1$, $h: A_1 \to A_2$. If h is a possibilities map relative to P, then h is faithful
relative to P.

Proof: h preserves initial state, so h is 0-faithful. Now suppose h is k-faithful, and P is
k-invariant, for some $k \in \mathbb{N}$. We show h is k+1-faithful.

Take $v \in \mathcal{V}_1$, $|v| \leq k+1$. We must show that $\sigma_2 h(v) \in h(\sigma_1 v)$. If $|v| \leq k$ then the result
follows directly since h is k-faithful, so assume $|v| = k+1$. Let $v = ue$, for some $u \in \mathcal{V}_1$, $e \in
\mathcal{E}_1$ ($|u| = k$).

Since h is k-faithful, $\sigma_2 h(u) \in h(\sigma_1 u)$. But P is k-invariant, so $\sigma_1 u \in P$. Since $ue \in \mathcal{V}_1$, $\sigma_1 u \in
PRE_1(e)$.

Thus $\sigma_1 u \in PRE_1(e) \cap \mathcal{R}_1 \cap P$, $\sigma_2 h(u) \in h(\sigma_1 u) \cap \mathcal{R}_2$,
$\Rightarrow \sigma_2 h(u)h(e) \in h(\sigma_1 ue)$, since h is a possibilities map relative to P,
$\Rightarrow \sigma_2 h(v) \in h(\sigma_1 v)$. ∎


Lemma 4.2.4.2.4: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$. Then h preserves events relative to P if and only if

1. h preserves transitions relative to P:
$s \in PRE_1(e) \cap \mathcal{R}_1 \cap P,\ t \in h(s) \cap PRE_2(h(e)) \cap \mathcal{R}_2$
$\Rightarrow (t)h(e) \in h(se)$

2. h preserves preconditions relative to P:
$s \in PRE_1(e) \cap \mathcal{R}_1 \cap P,\ t \in h(s) \cap \mathcal{R}_2$
$\Rightarrow t \in PRE_2(h(e))$

Proof: Similar to the proof of Lemma 4.2.2.6.


Defn 4.2.4.2.5: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, $P \subseteq \Sigma_1$, and let $h: A_1 \to A_2$.

We say P is invariant relative to h iff

1. P is 0-invariant

2. $(\forall k \in \mathbb{N})$ P is k-invariant, and h is k+1-faithful $\Rightarrow$ P is k+1-invariant

We now show that we can prove invariants and possibilities maps together with the same
induction:


Lemma 4.2.4.2.6: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, $P \subseteq \Sigma_1$. Let h be a possibilities map from $A_1$ to $A_2$ relative to P, and let P be
invariant in $A_1$ relative to h. Then h is a possibilities map, and P is invariant in $A_1$.

Proof: Since h is a possibilities map relative to P, h is faithful relative to P (by Lemma
4.2.4.2.3). We show inductively on k that P is k-invariant, and h is k-faithful. P is 0-invariant
and h is 0-faithful by definition. Assume P is k-invariant, and h is k-faithful. Since h is
faithful relative to P, h is k+1-faithful. Since P is invariant relative to h, P is k+1-invariant.

Thus P is invariant. To see that h is a possibilities map, note that h preserves initial states by
definition. h preserves events, because

$s \in PRE_1(e) \cap \mathcal{R}_1 \Rightarrow s \in P$ since P is invariant. ∎


4.2.4.3 Invariants on Fixed Subspaces 


We have described a process for proving an invariant simultaneously with proving a possibilities
map. In this section we show that this technique can be useful when a particular subspace of the
lower-level state space is unchanged by the state mapping.


Defn 4.2.4.3.1: Let $A = (\mathcal{E}, \Sigma, \sigma, \tau)$ be an event-state algebra, let J be some index set, and
let $\Sigma$ be the Cartesian product of component sets, $\Gamma_N$, for $N \in J$. We say that N is the name
of component $\Gamma_N$. We assume that each component has a unique name. (We will frequently
denote a component by a variable name used for an instance of the component set. For
example, if $\Sigma = \Gamma_1 \times \Gamma_2$, and we use $\langle A,B \rangle \in \Sigma$ to represent an instance of the state, then we
will refer to the "A-component," or the "B-component.") Let $N_1, N_2, \ldots, N_k$ be distinct names
from J. We say that $\Gamma_{N_1} \times \Gamma_{N_2} \times \ldots \times \Gamma_{N_k}$ is a subspace of $\Sigma$, with name $N = \langle N_1, N_2, \ldots, N_k \rangle$.
(Note that each such composite name denotes a unique subspace.) If $s \in \Sigma$, then "s.N"
denotes the projection of s onto the subspace named by N. If $\mathbf{s} = \langle s_1, s_2, \ldots, s_n \rangle$ is a vector of
states, then $\mathbf{s}.N$ is defined in the obvious way as $\langle s_1.N, s_2.N, \ldots, s_n.N \rangle$.

We extend the definitions of n-ary properties and invariants to properties which only depend on
a particular subspace.


Defn 4.2.4.3.2: Let $A = (\mathcal{E}, \Sigma, \sigma, \tau)$ be an event-state algebra, and let N name a subspace,
$\Gamma$, of $\Sigma$. If $I \subseteq \Gamma^n$, we say that I is an n-ary property for N.

Let I be an n-ary property for N, and let $I' = \{\mathbf{s} \in \Sigma^n : \mathbf{s}.N \in I\}$. We say I is invariant for N
in A iff $I'$ is invariant in A. If $k \in \mathbb{N}$, then we say I is k-invariant for N in A iff $I'$ is
k-invariant in A.

Invariants for a subspace are of interest when the state mapping between two algebras fixes that
subspace. We will show below that invariants for a fixed subspace can be "carried down" to the
lower-level algebra.


Defn 4.2.4.3.3: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$. Suppose that the state spaces of $A_1$ and $A_2$ both contain a
subspace, $\Gamma$, with name N. We say that h fixes N iff for all $s \in \Sigma_1$ and for all $t \in h(s)$, $t.N =
s.N$. (Thus h does not change the N-subspace of the state.) It is straightforward to show that
if h fixes N, $\mathbf{s} \in \Sigma_1^n$, and $\mathbf{t} \in h(\mathbf{s})$, then $\mathbf{t}.N = \mathbf{s}.N$.

Now we show how we can carry higher level invariants for fixed subspaces down to the lower
level. Because we might want to use these invariants in inductive proofs (in particular, as we explain
below, in inductive proofs of other relative invariants for the lower level), we state this lemma in
"parameterized" form (i.e. in terms of k-invariants and k-faithful mappings).


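Lemma 4.2.4.3.4: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, let $k \in \mathbb{N}$, and let $h: A_1 \to A_2$ be k-faithful. Let N name subspace $\Gamma$ in both $\Sigma_1$
and $\Sigma_2$, and suppose that h fixes N. If n-ary property $I \subseteq \Gamma^n$ is invariant for N in $A_2$, then I
is k-invariant for N in $A_1$.

Proof: Let $I_2 = \{\mathbf{t} \in \Sigma_2^n : \mathbf{t}.N \in I\}$, $I_1 = \{\mathbf{s} \in \Sigma_1^n : \mathbf{s}.N \in I\}$. I is invariant for N in $A_2$, so $\mathcal{R}_2^{(n)}
\subseteq I_2$. We must show that $I_1$ is k-invariant in $A_1$. Let $\langle v_1, v_2, \ldots, v_n \rangle \in \mathcal{V}_1^n$ with $v_1 \leq v_2 \leq \ldots \leq
v_n$, and $|v_n| \leq k$; we show that $\mathbf{s} = \langle \sigma_1 v_1, \sigma_1 v_2, \ldots, \sigma_1 v_n \rangle \in I_1$. Since h is k-faithful, $\sigma_2 h(v_i)
\in h(\sigma_1 v_i)$ for $i = 1,2,\ldots,n$. Let $\mathbf{t} = \langle \sigma_2 h(v_1), \sigma_2 h(v_2), \ldots, \sigma_2 h(v_n) \rangle$. Then $\mathbf{t} \in \mathcal{R}_2^{(n)}$, because each
$\sigma_2 h(v_i) \in \mathcal{R}_2$, and $h(v_1) \leq h(v_2) \leq \ldots \leq h(v_n)$ since h is event-homomorphic.

Since $\mathbf{t} \in \mathcal{R}_2^{(n)}$, $\mathbf{t} \in I_2$; thus $\mathbf{t}.N \in I$. But $\mathbf{t} \in h(\mathbf{s})$, and h fixes N, so $\mathbf{t}.N = \mathbf{s}.N$. Thus $\mathbf{s}.N \in I$,
$\Rightarrow \mathbf{s} \in I_1$. ∎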


Because a mapping which is a possibilities map is necessarily faithful, and hence k-faithful for
all k, we have the following lemma:

Lemma 4.2.4.3.5: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state
algebras, and let $h: A_1 \to A_2$ be a possibilities map. Let N name subspace $\Gamma$ in both $\Sigma_1$ and
$\Sigma_2$, and suppose that h fixes N. If n-ary property $I \subseteq \Gamma^n$ is invariant for N in $A_2$, then I is
invariant for N in $A_1$.

Proof: Immediate corollary of Lemma 4.2.2.4 and Lemma 4.2.4.3.4. ∎


We showed in Lemma 4.2.4.2.6 that if h is a possibilities map from $A_1$ to $A_2$ relative to property
P, and P is invariant for $A_1$ relative to mapping h, then it follows that h is a possibilities map. Because of
Lemma 4.2.4.3.4, we can use known invariants for fixed subspaces in $A_2$ to prove that P is invariant
relative to h. Note that in proving that P is invariant relative to h, we can assume that h is k+1-faithful
(instead of simply k-faithful) when showing P is k+1-invariant. By Lemma 4.2.4.3.4, we can thus assume
that invariants from $A_2$ for fixed subspaces are k+1-invariant.


We will generally apply Lemma 4.2.4.3.4 to 1-ary or 2-ary invariants. We summarize the
$(s.N, t.N) \in J$
$\Rightarrow t \in P$ (by the Induction Hypothesis of the Lemma),
$\Rightarrow$ P is k+1-invariant. ∎


It is important to understand exactly what Lemma 4.2.4.3.6 says: We cannot assume that the
higher-level invariants (I and J) are truly invariant in $A_1$, but we can assume they are k+1-invariant for
the induction step of showing P invariant. Because we construct the induction so that faithfulness of h
stays "one step ahead" of invariance of P, we can assume both $t.N \in I$, and $(s.N, t.N) \in J$ above. (If we
only knew that h were k-faithful, instead of k+1-faithful, then we would only be able to assume $s.N \in I$.)


4.2.5 Augmentation Maps and Auxiliary State 


The power of possibilities maps to map a single state into a set of states is useful when the
lower-level algebra is somehow "more abstract" than the higher-level algebra. If the higher-level model
retains more information about a system than a lower-level model, then the low-level state will not
uniquely determine the high-level state. Another technique for showing a valid interpretation from one
algebra to another is to augment the lower-level state with auxiliary variables. These variables are
"virtual" components of the state, in that they do not enter into any preconditions for events, and the
transition effects on other components of the state are not affected by the auxiliary variables.


Defn 4.2.5.1: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state algebras.
We say that $A_2$ is an augmentation of $A_1$ with auxiliary state Aux iff

1. $\mathcal{E}_2 = \mathcal{E}_1$

2. $\Sigma_2 = \Sigma_1 \times Aux$

3. $\sigma_2 = (\sigma_1, a_0)$ for some $a_0 \in Aux$

4. $\forall e \in \mathcal{E}_1$, $PRE_2(e) = PRE_1(e) \times Aux$ (i.e. the auxiliary state enters into no
preconditions)

5. $(s,a) \in PRE_2(e) \Rightarrow \tau_2(e)(s,a) = (\tau_1(e)(s), a')$ for some $a' \in Aux$ (i.e. the
auxiliary state does not affect transitions)

If $A_2$ is an augmentation of $A_1$ with Aux, then we define the augmentation map, $h: A_1 \to
A_2$, as follows:

$\forall e \in \mathcal{E}_1,\ h(e) = e$
$\forall s \in \Sigma_1,\ h(s) = \{s\} \times Aux$.

Lemma 4.2.5.2: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state algebras,
and let $A_2$ be an augmentation of $A_1$ with auxiliary state Aux. Then $\Sigma_1$ is a subspace of $\Sigma_2$
and the augmentation map, h, fixes $\Sigma_1$.

Proof: Straightforward from the definition. ∎
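The construction of an augmentation and its augmentation map can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the formal development: the helper names `augment` and `augmentation_map`, the door example, and the choice of an event counter modulo 3 as the auxiliary state are all hypothetical, and `None` stands for $\bot$.

```python
BOTTOM = None  # stands for the undefined result; real states must not be None

def augment(step1, update_aux):
    """Build the transition function on pairs (s, a) from a base transition function.

    The precondition is decided by the base state s alone, and the new base
    state is computed without looking at the auxiliary value a.
    """
    def step2(e, state):
        s, a = state
        s_next = step1(e, s)
        if s_next is BOTTOM:
            return BOTTOM
        return (s_next, update_aux(e, s, a))
    return step2

def augmentation_map(aux_values):
    """State part of the augmentation map: h(s) = {s} x Aux (events map to themselves)."""
    def h_state(s):
        return {(s, a) for a in aux_values}
    return h_state

def door_step(e, s):
    """Hypothetical base algebra: a door that can only be opened when closed, and vice versa."""
    if e == "open" and s == "closed":
        return "open"
    if e == "close" and s == "open":
        return "closed"
    return BOTTOM

AUX = (0, 1, 2)  # auxiliary state: number of events so far, modulo 3

def count_events(e, s, a):
    return (a + 1) % 3

door_step2 = augment(door_step, count_events)
h_state = augmentation_map(AUX)
```

In this sketch every member of `h_state(s)` has `s` as its first component, so the mapping leaves the base component unchanged, and each augmented transition lands inside `h_state` of the corresponding base transition.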


The following lemma shows a relationship between the technique of using auxiliary state, and
the technique of defining a possibilities map: every augmentation map is a possibilities map.

Lemma 4.2.5.3: Let $A_1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$ and $A_2 = (\mathcal{E}_2, \Sigma_2, \sigma_2, \tau_2)$ be event-state algebras,
and let $A_2$ be an augmentation of $A_1$ with auxiliary state Aux. Then the augmentation map,
h, is a possibilities map.

Proof:

1. $h(\sigma_1) = \{(\sigma_1, a) : a \in Aux\}$,
$\Rightarrow (\sigma_1, a_0) \in h(\sigma_1)$.

2. Let $s \in PRE_1(e) \cap \mathcal{R}_1$, $t \in h(s) \cap \mathcal{R}_2$,
$\Rightarrow t = (s,a)$ for some $a \in Aux$.

$s \in PRE_1(e) \Rightarrow (s,a) \in PRE_2(h(e)) = PRE_2(e)$,
$\Rightarrow (t)h(e) = te = \tau_2(e)(t) = (\tau_1(e)(s), a')$ for some $a'$.

But $h(se) = \{(se, a) : a \in Aux\} = \{(\tau_1(e)(s), a) : a \in Aux\}$,
$\Rightarrow (t)h(e) \in h(se)$. ∎

4.3 Distributed System Model


We model a distributed system as a special type of event-state algebra. First we present a
general framework for a "distributed algebra," and then we specialize further to a particular model for the
distributed environment of our transaction system. While these models have considerably more structure
than an arbitrary event-state algebra, it is important to note that they can still be described as special cases
of event-state algebras. Thus we can apply our results for possibilities maps and invariants directly to
these distributed algebras.


4.3.1 Distributed Event-State Algebras


Defn 4.3.1.1: Let $A = (\mathcal{E}, \Sigma, \sigma, \tau)$ be an event-state algebra, let I be a finite index set, and
let orig be a mapping $orig: \mathcal{E} \to I$. We say that A is distributed over I using orig provided
that the following are true:


a. $\Sigma$ is the Cartesian product of sets $\Sigma_i$ for $i \in I$. We will use index i as the
component name for set $\Sigma_i$.


b. $\sigma$ is a vector of initial states, $\sigma_i \in \Sigma_i$ for $i \in I$.


c. For each $i \in I$, there is a local transition relation $\tau_i \subseteq \mathcal{E} \times \Sigma_i \times \Sigma_i$. $\tau_i$ must
satisfy the following “local precondition” property: If $e \in \mathcal{E}$, $s \in \Sigma_i$, and $orig(e) \neq i$,
then $\tau_i(e)(s) \neq \bot$. Then $\tau$ is determined by the local transition relations as
follows: $\tau = \{(e,s,t): (e, s.i, t.i) \in \tau_i\ \forall i \in I\}$.


If $orig(e) = i$, then we say that component i is the originator of event e.


Because the transition relation of a distributed event-state algebra is defined by combining local 
transition relations for each component, the effect of each event on a component depends only on the 
current state of that component. It is possible for an event to affect several components, however. (Thus 


we are permitting an arbitrary “interconnection” of components through events.) 
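The way a global transition is assembled from local ones can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each local transition is modelled as a partial function (returning `None` for an undefined result) rather than a general relation, and the names `global_step`, `has_local_preconditions`, `tau_p`, `tau_q`, `LOCAL`, and `ORIG` are hypothetical and do not come from the thesis.

```python
from typing import Callable, Dict, Hashable, Optional

State = Dict[str, Hashable]  # one local state per component name
# Assumption: a local transition returns the next local state, or None
# when the event is undefined (not enabled) at that component.
LocalTransition = Callable[[str, Hashable], Optional[Hashable]]


def global_step(local: Dict[str, LocalTransition], event: str,
                state: State) -> Optional[State]:
    """Combine local transitions: the event is enabled globally only if
    every component accepts it; each component sees only its own state."""
    nxt = {}
    for comp, tau_i in local.items():
        t = tau_i(event, state[comp])
        if t is None:
            return None
        nxt[comp] = t
    return nxt


def has_local_preconditions(local, orig, events, local_states) -> bool:
    """Check the 'local precondition' property: a component that is not
    the originator of an event never refuses that event."""
    return all(
        local[comp](e, s) is not None
        for e in events
        for comp in local
        if orig[e] != comp
        for s in local_states[comp]
    )


# Hypothetical two-component example: "inc" originates at p, is enabled
# only while p < 2, and also has an effect at q.
def tau_p(e, s):
    if e == "inc":
        return s + 1 if s < 2 else None
    return s


def tau_q(e, s):
    return s + 1 if e == "inc" else s  # q never refuses an event


LOCAL = {"p": tau_p, "q": tau_q}
ORIG = {"inc": "p"}
```

In this toy example the event affects both components, but only its originator p can block it, which is the property required by part (c) of the definition.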


Although an event can have effects at several components, its precondition must be local to its 


originating component. Only the originator can control when one of its own events can occur. 




In [Lynch82], a “local mapping” technique is explored for constructing a possibilities map from 
a distributed event-state algebra to another event-state algebra; the possibilities map is defined as the 


intersection of local possibilities maps from the states of all components in the distributed algebra. 


4.3.2 Message-based Distributed Algebras 


We now restrict distributed event-state algebras further to model the particular distributed 
environment of this thesis. The basic system components are nodes, with local state spaces and local event 
sets. All communication between nodes must flow through a distinguished system component, the 
message buffer. We define distinguished send and receive events for communications through the 


message buffer. 


We give the message buffer a specific semantics: We postulate that messages are delivered in 
arbitrary order after they are sent, and that they can arrive any number of times (including 0) after they 
are sent. These assumptions allow us to model the message buffer as a set of messages (the set of all
messages ever sent). It is never necessary to remove a message from this set, because we assume that 


messages can be duplicated and delayed arbitrarily. 


Defn 4.3.2.1: Let $A = (\mathcal{E}, \Sigma, \sigma, \tau)$ be an event-state algebra distributed over I using orig.
Let Nodes be a finite set of nodes, let $\text{Msgs}_{i,j}$ be a set of messages from node i to node j ($i,j \in \text{Nodes}$),
and let $\text{Msgs} = \bigcup_{i,j} \text{Msgs}_{i,j}$. We say that A is a message-based algebra over Nodes
using Msgs if the following are true:


a. $I = \text{Nodes} \cup \{\text{buf}\}$, where "buf" names the message buffer component.

b. $\Sigma_{buf} = \mathcal{P}(\text{Msgs})$ (i.e. the message buffer is a set of messages). Let $BUF = \Sigma_{buf}$.


c. $\sigma.\text{buf} = \emptyset$ (the message buffer is initially empty; thus no message can be
received before it is sent).


d. Let $\text{Comm} = \{\text{send } M: M \in \text{Msgs}\} \cup \{\text{receive } M: M \in \text{Msgs}\}$ be the set of
communications events. Then $\text{Comm} \subseteq \mathcal{E}$. If $M \in \text{Msgs}_{i,j}$ then orig(send M) = i,
orig(receive M) = buf: The originator of a send event is the source node for the
message, and the originator of a receive event is the message buffer. (We regard the
destination node for a message as passive in the communications process.)

e. If $e \in \mathcal{E} - \text{Comm}$, and $i \neq orig(e)$, then $\tau_i(e)$ is the identity on $\Sigma_i$. Thus all “local”
events (events not in Comm) must have only local effects. (Note that preconditions
must be local by the definition of a distributed algebra).


f. If $M \in \text{Msgs}_{i,j}$ then $\tau_k(\text{send } M)$ is the identity on $\Sigma_k$, for $k \neq i, \text{buf}$. $\tau_i(\text{send } M)
\subseteq \{(a,a): a \in \Sigma_i\}$. Thus although the sender of a message imposes a precondition
on the sending of a message, the send has no effect on the sender's state.


g. $\tau_{buf}(\text{send } M) = \{(b, b \cup \{M\}): b \in BUF\}$. Thus the effect of a send event on
the buffer is to add the message to the buffer.


h. $\tau_k(\text{receive } M)$ is the identity on $\Sigma_k$, for $k \neq j, \text{buf}$. $\tau_j(\text{receive } M)(a) \neq \bot$, $\forall a \in
\Sigma_j$. Thus receipt of a message affects the state of the receiver, but the receiver
cannot impose a precondition on receive events. (The originator of a receive event
is the message buffer.)


i. $\tau_{buf}(\text{receive } M) = \{(b,b): b \in BUF \wedge M \in b\}$. Thus a receive event for a
message can occur whenever that message is in the buffer. A receive event has no
effect on the state of the message buffer, however. (Messages are never removed).
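The buffer semantics in parts (c), (g) and (i) can be illustrated with a small Python sketch. It is a minimal model under stated assumptions: messages are represented as strings, an undefined transition is represented by `None`, and the names `send`, `receive`, and `EMPTY` are hypothetical.

```python
from typing import FrozenSet, Optional

Buffer = FrozenSet[str]  # the set of all messages ever sent

EMPTY: Buffer = frozenset()  # part (c): the buffer is initially empty


def send(buf: Buffer, msg: str) -> Buffer:
    """Part (g): a send is always defined at the buffer and adds the message."""
    return buf | {msg}


def receive(buf: Buffer, msg: str) -> Optional[Buffer]:
    """Part (i): a receive is enabled only if the message is in the buffer,
    and it leaves the buffer unchanged (messages are never removed)."""
    return buf if msg in buf else None
```

Because `receive` never removes anything, the same message can be received any number of times (including zero) and in any order relative to other messages, which is exactly the duplication and delay behaviour postulated above.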


We stress that the message semantics we have chosen is not inherent in the distributed algebra 
framework; this semantics is simply convenient for describing our system. Our message-based model 
could be changed easily to provide for different communications semantics. For example, we could 
model a “reliable" communications system by making the message buffer an ordered list of messages, 


which only delivers messages from the head of the list and removes them upon delivery.

5. Proof Strategy


In the following chapters we will specify several levels of an event-state algebra hierarchy; these
algebras model a distributed transaction system. The algebras are presented in top-down order: the
top-level algebra (Level 0) is the “most abstract,” and the bottom-level algebra (Level 7) is the “most
concrete.” At each level we specify the state, the initial state, the events, and the transition relation. At
each level (except Level 0) we also specify a mapping from the new level to the previous (higher) level,
and we show that this mapping is a possibilities map.


Our goal is to show that the orphan detection strategy which we outlined in Chapter 1 
guarantees view-serializability. Thus our top-level model specifies our “correctness condition:" The 
Level 0 state is just the set of all action trees, and we define simple events to create, commit, and abort 
actions, and to perform an access. The only preconditions at Level 0 require that each state generated by 


any event be view-serializable. 


At Level 1 we add a data ordering to the state (thus states are now augmented action trees). We 
impose preconditions on events to restrict the reachable states to view-serializable AAT’s. We define the 
set of aborts "depended on" by an action; as one of our preconditions we require that each state 
generated by any event satisfy condition ANC-ABORT -- no action can depend on an abort of one of its 
ancestors. We then show that all reachable AAT’s in Level 1 are view-serializable. Thus the obvious 


mapping from Level 1 to Level 0 is a possibilities map. 


At Level 2 we remove the ANC-ABORT condition by adding a precondition to perform events. 
This precondition essentially states that an access should not see an abort dependency on an ancestor at 
the time it is performed. We show that the reachable states in Level 2 satisfy ANC-ABORT (using this 
new precondition); thus the obvious mapping from Level 2 to Level 1 is a possibilities map. We refer to 


this precondition as the “orphan detection” precondition. 


Levels 0 - 2 are global state algebras, in that we regard the transaction system as operating on a 
single global state. These levels can be thought of as “centralized” interpretations of the events in a 
distributed action system. Lower levels progressively “distribute” this global state and localize the 


preconditions and effects of events. 


At Level 3 we introduce “locations,” which can be thought of as abstract nodes. Each action and 
each object has its own location. The information at a location consists of a (local) unlabeled action
summary, plus the datastep ordering from the AAT. We define very simple “communications” events to
transfer information to any location. We show that it is relatively simple to localize all preconditions 
except for the orphan detection precondition. The orphan detection condition must still be expressed in 
terms of the global AAT. The implication of this result is that our communications steps at Level 3 do not 


include enough information to completely localize orphan detection. 


At Level 4 we introduce value maps -- a data structure which models the locks and versions of 
atomic objects. The Level 4 state consists of an AAT, a “local state" mapping from locations to UAS’s, 
and a value map for each object. We regard the value map as a local data structure (conceptually each 
object has its own value map.) We replace some of the preconditions on perform events with 
preconditions on value maps, and we modify the transition effects of actions to update value maps 


appropriately. 


At Level 5 we succeed in localizing the orphan detection precondition by piggybacking abort 
information on the create and commit communications events. This abort information models the 
DONE lists of our simplified orphan detection algorithm. The key invariant proved for Level 5 states 


that each location always has “enough” abort information. 


Because all preconditions are localized at Level 5, the global AAT can be regarded as a “virtual” 
component of state. We project out this global state at Level 6, and we construct a trivial augmentation 
map between Level 6 and Level 5. Although the resulting algebra is “localized,” it does not quite fit our 
definition of a "distributed" event-state algebra. To define a distributed event-state algebra, we must 
assign “locations” (abstract nodes) to physical nodes. An additional complication results from the 
simplicity of our communications events at Levels 3 - 6: The transfer of information caused by these 
events is considered instantaneous at these levels. For a distributed event-state algebra we must model 


arbitrary communications delays. 


Level 7 presents a distributed event-state algebra. Many actions and objects can reside at a 
single node, and messages are sent asynchronously via a message buffer. In mapping from Level 7 to 
Level 6, we account for the communications delays in the message buffer by considering messages 
themselves to be abstract “locations.” (One way to think of this device is to imagine that at Level 6 we 
can consider all communication events to be instantaneous, but all messages are sent via a third party. At 
Level 7, we “know” that this third party is really the message buffer, but at Level 6 this detail is not 
necessary.) 


6. Global State Models


This chapter presents Levels 0 - 2 of the event-state algebra hierarchy. Level 0 describes our 
correctness condition: every action tree generated by the system must be view-serializable. Level 1 is a 
global state model based on AAT’s. Level 1 develops the crucial link between view-serializability and 
orphan detection: We define the “aborts dependency set” for an action in an AAT, and we require at 
Level 1 that no action can depend on an abort of one of its ancestors. Informally this condition, which
we call ANC-ABORT, means that no action can "know" that it is an orphan. At Level 1 we show that the 


ANC-ABORT condition (along with other preconditions on events) implies view-serializability. 


The ANC-ABORT condition is imposed at Level 1 by requiring that the next state generated by 
each event satisfy ANC-ABORT. At Level 2 we replace this restriction with a single precondition on data 


accesses, and we show that this precondition suffices to guarantee ANC-ABORT. 


We also make use of an auxiliary algebra, which we call "Level A" (denoted La). Level A 
consists of Level 1 without the ANC-ABORT restriction. Thus Levels 1 and 2 are both logically “below” 
Level A. The advantage of using this auxiliary level is that we can easily construct a possibilities map 
from Level 2 to Level A; we will then use Level A invariants in showing that there is a possibilities map 
from Level 2 to Level 1.


We will use the following distinguished symbols to define the initial states of the algebras: 


$T_0$ denotes the trivial AAT containing only vertex U with status ‘active’, and an empty data ordering:


$\text{vertices}_{T_0} = \{U\}$

$\text{status}_{T_0}(U) = \text{‘active’}$

$\text{label}_{T_0} = \emptyset$

$\text{data}_{T_0} = \emptyset$


T, = erase(T,) (an action tree), and T,, = unlabel(T;) (a UAS). 




6.1 Level 0 Algebra 


The Level 0 state consists of a (global) action tree. The events at Level 0 are just those needed to 
create an action tree: we define events to create an action, commit and abort an action, and perform an 
access with a given value (this value gives the label of the datastep in the action tree). The only constraint 


on validity of an execution sequence at Level 0 is that the resulting action tree must be view-serializable. 
$L0 = (\mathcal{E}_0, \Sigma_0, \sigma_0, \tau_0)$

$\mathcal{E}_0$ = {create A, commit A, abort A, perform A,u} (see below).

$\Sigma_0$ is the set of all action trees.

$\sigma_0$ is the trivial action tree, i.e. the erasure of the trivial AAT.

$\tau_0$, the transition relation, is specified below via preconditions and transition effects for each event:


Let the current state be T. For each event, we give the transition function which maps T → T1. The
precondition for each event is logically a condition on T, the current state, but we specify it as a
condition on T1. (Since T uniquely determines T1, a condition on T1 maps directly into a condition on
T.) The single precondition for each event requires that the next state (T1) be view-serializable. Let VSR
denote the set {T: T is a view-serializable action tree}.


1. create A ($A \in \text{act} - \{U\}$)
PRECONDITIONS:
a. $T1 \in VSR$
TRANSITIONS:
a. $\text{vertices}_{T1} \leftarrow \text{vertices}_T \cup \{A\}$
b. $\text{status}_{T1}(A) \leftarrow \text{‘active’}$


2. commit A ($A \in \text{act} - \{U\} - \text{accesses}$)
PRECONDITIONS:
a. $T1 \in VSR$
TRANSITIONS:
a. $\text{status}_{T1}(A) \leftarrow \text{‘committed’}$


3. abort A ($A \in \text{act} - \{U\}$)
PRECONDITIONS:
a. $T1 \in VSR$
TRANSITIONS:
a. $\text{status}_{T1}(A) \leftarrow \text{‘aborted’}$


4. perform A,u ($A \in \text{accesses}(x)$, $u \in \text{values}(x)$)
PRECONDITIONS:
a. $T1 \in VSR$
TRANSITIONS:
a. $\text{status}_{T1}(A) \leftarrow \text{‘committed’}$
b. $\text{label}_{T1}(A) \leftarrow u$
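The shape of the Level 0 specification (unconditional transition effects, guarded only by membership of the next state in VSR) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's construction: an action tree is reduced to a pair of dictionaries, the VSR test is passed in as a parameter because view-serializability is defined elsewhere, and the names `Tree`, `TRIVIAL`, `apply_effect`, and `step` are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Assumption: an action tree is sketched as (status, label); the vertex
# set is the key set of `status`, and the root U is always present.
Tree = Tuple[Dict[str, str], Dict[str, object]]

TRIVIAL: Tree = ({"U": "active"}, {})


def apply_effect(tree: Tree, event: Tuple) -> Tree:
    """Transition effects only (no preconditions); returns a new tree."""
    status, label = dict(tree[0]), dict(tree[1])
    kind, a = event[0], event[1]
    if kind == "create":
        status[a] = "active"
    elif kind == "commit":
        status[a] = "committed"
    elif kind == "abort":
        status[a] = "aborted"
    elif kind == "perform":
        status[a] = "committed"
        label[a] = event[2]
    else:
        raise ValueError(kind)
    return status, label


def step(tree: Tree, event: Tuple,
         in_vsr: Callable[[Tree], bool]) -> Optional[Tree]:
    """Level 0 step: the event is enabled exactly when the next tree
    satisfies the supplied VSR predicate."""
    nxt = apply_effect(tree, event)
    return nxt if in_vsr(nxt) else None
```

Any predicate can stand in for VSR when experimenting with this sketch; whatever predicate is supplied, every state produced by `step` satisfies it by construction, which is why Level 0 serves as the correctness condition.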


The following lemma justifies our statement that LO defines our correctness condition, because 


all reachable states in LO are view-serializable action trees. 
Lemma 6.1.1: Let $T \in \mathcal{R}_0$. Then $T \in VSR$.


Proof: T is reached from the initial state by some valid sequence $v \in \mathcal{V}_0$. If $v \neq \Lambda$, then $T = T'e$
for some $e \in \mathcal{E}_0$, $T' \in PRE_0(e)$, and by the VSR precondition for e, $T \in VSR$. If $v = \Lambda$, then T is
the trivial action tree, which is trivially in VSR. ∎


6.2 Level 1 Algebra and Mapping $h_{10}$


The Level 1 state consists of a (global) AAT. The events are identical to those defined at Level 
0, but we modify the preconditions as we begin to specify in detail how the transaction system functions. 


$L1 = (\mathcal{E}_1, \Sigma_1, \sigma_1, \tau_1)$
$\mathcal{E}_1 = \mathcal{E}_0$ = {create A, commit A, abort A, perform A,u}.
$\Sigma_1$ is the set of all augmented action trees.
$\sigma_1 = T_0$, the trivial AAT.
$\tau_1$, the transition relation, is specified below via preconditions and transition effects for each event.


We will define a condition, ANC-ABORT, on AAT’s, which essentially states that an action 
cannot know that it is an orphan. We include a precondition for each event in L1 which requires that the 
next state generated by this event must satisfy ANC-ABORT. It will follow trivially that ANC-ABORT is 
satisfied by all reachable states in L1. 


6.2.1 Aborts Dependencies and Condition ANC-ABORT 


We want to develop a condition which will rule out execution sequences in which orphans see 
“inconsistent” data. To devise a condition which can distinguish "bad” orphans from orphans which are 
not dangerous, we define the set of aborts upon which an action “depends.” The ANC-ABORT
condition simply states that an action cannot depend upon the abort of any of its ancestors. 


Informally, an action depends on any abort which allowed the action to proceed. Because of 
sequential dependencies, any abort of a sequentially preceding sibling is depended on by its following 
siblings and their descendants. A parent also depends on the aborts of any of its children. Any abort 
which “releases a lock” on an object subsequently read by an action is depended upon by that action. 
Our Level 1 model does not have explicit “locks”; locks and versions are represented by the entire action
tree. (Precondition P1.4b below is essentially a “lock” condition which says that two actions (at any level)
cannot interfere on the same object: one must either commit or abort before the other is allowed to
proceed. Precondition P1.4c is essentially a “current version” condition which says that the current
version seen by a datastep is the result of all preceding accesses which are visible to it.)

An action also depends on all the aborts depended on by committed actions which might pass
information to it (which for our purposes will be considered all the actions in its view set). Thus the 
aborts dependency set for an action is defined as the union over all actions in its view set of the 


“immediate aborts” preceding those actions. 


Defn 6.2.1.1: Let T be an AAT, $A \in \text{vertices}_T$. We define the aborts dependency set of A in
T as follows:

$\text{ABORTS}_T(A) = \bigcup_{B \in \text{vset}_T(A)} \text{i-precedes}_T(B)$

We define the set ANC-ABORT as the set of all AAT’s in which no action depends upon the
abort of an ancestor:

Defn 6.2.1.2: $\text{ANC-ABORT} = \{T: \forall A \in \text{vertices}_T,\ \text{anc}(A) \cap \text{ABORTS}_T(A) = \emptyset\}$.
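The two definitions above translate directly into set operations. The following Python sketch assumes that the view sets, the immediate-abort sets, and the ancestor sets have already been computed and are supplied as dictionaries (their definitions lie outside this section); the function names `aborts` and `in_anc_abort` and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Set


def aborts(a: str, vset: Dict[str, Set[str]],
           i_precedes: Dict[str, Set[str]]) -> Set[str]:
    """ABORTS_T(A): union of i-precedes_T(B) over all B in vset_T(A)."""
    out: Set[str] = set()
    for b in vset[a]:
        out |= i_precedes.get(b, set())
    return out


def in_anc_abort(vertices: Iterable[str], anc: Dict[str, Set[str]],
                 vset: Dict[str, Set[str]],
                 i_precedes: Dict[str, Set[str]]) -> bool:
    """ANC-ABORT: no action depends on the abort of one of its ancestors."""
    return all(not (anc[a] & aborts(a, vset, i_precedes)) for a in vertices)


# Hypothetical data: C is a child of P; Q is an aborted sibling that
# sequentially precedes P, so P (and hence C) depends on the abort of Q.
ANC = {"U": {"U"}, "P": {"P", "U"}, "Q": {"Q", "U"}, "C": {"C", "P", "U"}}
VSET = {"U": {"U"}, "P": {"P"}, "Q": {"Q"}, "C": {"C", "P"}}
```

Depending on the abort of a sibling such as Q is harmless; the condition is violated only when the dependency set of an action contains one of that action's own ancestors.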


We also define a “sequential aborts set” which represents all the aborts upon which an action
depends when it is first created.


Defn 6.2.1.3: Let T be an AAT, $A \in \text{vertices}_T$. We define the sequential aborts dependency
set of A in T as follows:

$\text{SEQ-ABORTS}_T(A) = \text{i-anc-seq}_T(A) \cup \bigcup_{B \in \text{v-anc-seq}_T(A)} \text{ABORTS}_T(B)$
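The following lemma relates the sequential aborts set of an action to the sequential aborts set of
its parent:

Lemma 6.2.1.4: Let T be an AAT, and $A \in \text{vertices}_T$. If $A \neq U$, then

$\text{SEQ-ABORTS}_T(A) = \text{SEQ-ABORTS}_T(\text{parent}(A)) \cup \bigcup_{B \in \text{v-seq}_T(A)} \text{ABORTS}_T(B) \cup \text{i-seq}_T(A)$.

And $\text{SEQ-ABORTS}_T(U) = \emptyset$.


Proof: It is obvious that $\text{SEQ-ABORTS}_T(U) = \emptyset$. Take $A \neq U$. By definition of
SEQ-ABORTS,

$\text{SEQ-ABORTS}_T(A) = \text{i-anc-seq}_T(A) \cup \bigcup_{B \in \text{v-anc-seq}_T(A)} \text{ABORTS}_T(B)$.

But $\text{i-anc-seq}_T(A) = \text{i-seq}_T(A) \cup \text{i-anc-seq}_T(\text{parent}(A))$, and $\text{v-anc-seq}_T(A) = \text{v-seq}_T(A) \cup
\text{v-anc-seq}_T(\text{parent}(A))$. Thus

$\text{SEQ-ABORTS}_T(A) = \text{i-seq}_T(A) \cup \text{i-anc-seq}_T(\text{parent}(A)) \cup \bigcup_{B \in \text{v-seq}_T(A)} \text{ABORTS}_T(B) \cup \bigcup_{B \in \text{v-anc-seq}_T(\text{parent}(A))} \text{ABORTS}_T(B)$

$= \text{SEQ-ABORTS}_T(\text{parent}(A)) \cup \bigcup_{B \in \text{v-seq}_T(A)} \text{ABORTS}_T(B) \cup \text{i-seq}_T(A)$. ∎

The following lemma relates the flow of information via view sets to the flow of abort
information via ABORTS sets: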
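Lemma 6.2.1.5: Let T be an AAT, $A \in \text{vertices}_T$, $B \in \text{vset}_T(A)$, then
$\text{ABORTS}_T(B) \subseteq \text{ABORTS}_T(A)$.

Proof: $\text{ABORTS}_T(A) = \bigcup_{C \in \text{vset}_T(A)} \text{i-precedes}_T(C)$, while $\text{ABORTS}_T(B) = \bigcup_{C \in \text{vset}_T(B)} \text{i-precedes}_T(C)$.
But if $B \in \text{vset}_T(A)$, then $\text{vset}_T(B) \subseteq \text{vset}_T(A)$ by Lemma 2.3.3a. The lemma follows directly. ∎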


The definition of view sets as the closure under $\text{v-precedes}_T$ allows us to write a recursive
expression for $\text{ABORTS}_T$:

Lemma 6.2.1.6: Let T be an AAT, $A \in \text{vertices}_T$, then

$\text{ABORTS}_T(A) = \text{i-precedes}_T(A) \cup \bigcup_{B \in \text{v-precedes}_T(A)} \text{ABORTS}_T(B)$


Proof: $\text{vset}_T(A) = \{A\} \cup \text{v-precedes}_T^+(A) = \{A\} \cup \bigcup_{B \in \text{v-precedes}_T(A)} \text{vset}_T(B)$.
The Lemma follows directly. ∎


Since action trees are always finite, we can use this recursive form in inductive proofs of
properties of aborts sets if we show that tracing back the $\text{v-precedes}_T$ relation will not result in cycles, i.e.
that $\forall A \in \text{vertices}_T$, $A \notin \text{v-precedes}_T^+(A)$. (If $\text{v-precedes}_T$ were cyclic, then the induction might not be
well-founded.) We will prove below that $\text{v-precedes}_T$ is acyclic for all reachable trees in La (and hence
for all reachable trees in L1).
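The agreement between the closure form of the definition and the recursive form of Lemma 6.2.1.6, and the role of acyclicity, can be checked on small examples with the following Python sketch. It assumes that v-precedes and i-precedes are given as dictionaries from an action to a set of actions; the function names are hypothetical.

```python
def vset(a, v_precedes):
    """View set of a: a itself plus everything reachable via v-precedes."""
    seen, stack = {a}, [a]
    while stack:
        x = stack.pop()
        for b in v_precedes.get(x, ()):
            if b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def aborts_closure(a, v_precedes, i_precedes):
    """Closure form: union of i-precedes over the whole view set."""
    out = set()
    for b in vset(a, v_precedes):
        out |= i_precedes.get(b, set())
    return out


def aborts_recursive(a, v_precedes, i_precedes, _path=()):
    """Recursive form; refuses to recurse around a cycle, since the
    induction is well-founded only when v-precedes is acyclic."""
    if a in _path:
        raise ValueError("v-precedes is cyclic")
    out = set(i_precedes.get(a, set()))
    for b in v_precedes.get(a, ()):
        out |= aborts_recursive(b, v_precedes, i_precedes, _path + (a,))
    return out
```

On acyclic inputs the two functions return the same set; on a cyclic input the recursive form has no base case to bottom out on, which is the situation the acyclicity result is needed to exclude.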


6.2.2 Specification of Event Preconditions and Transitions for L1 


Let the current state be T. For each event, we give the transition function which maps T →
T1. Preconditions are specified as a function of T, except for the ANC-ABORT condition which requires
that the next state be in ANC-ABORT.


1. create A ($A \in \text{act} - \{U\}$)
PRECONDITIONS:
a. $A \notin \text{vertices}_T$
b. $\text{parent}(A) \in \text{active}_T$
c. $(B,A) \in \text{seq}$, $B \neq A \Rightarrow B \in \text{done}_T$
d. $T1 \in \text{ANC-ABORT}$
TRANSITIONS:
a. $\text{vertices}_{T1} \leftarrow \text{vertices}_T \cup \{A\}$
b. $\text{status}_{T1}(A) \leftarrow \text{‘active’}$


2. commit A ($A \in \text{act} - \{U\} - \text{accesses}$)
PRECONDITIONS:
a. $A \in \text{active}_T$
b. $\text{children}_T(A) \subseteq \text{done}_T$
c. $T1 \in \text{ANC-ABORT}$
TRANSITIONS:
a. $\text{status}_{T1}(A) \leftarrow \text{‘committed’}$


3. abort A ($A \in \text{act} - \{U\}$)
PRECONDITIONS:
a. $A \in \text{active}_T$
b. $T1 \in \text{ANC-ABORT}$
TRANSITIONS:
a. $\text{status}_{T1}(A) \leftarrow \text{‘aborted’}$


4. perform A,u ($A \in \text{accesses}(x)$, $u \in \text{values}(x)$)
PRECONDITIONS:
a. $A \in \text{active}_T$
b. $B \in \text{datasteps}_T(x) \Rightarrow B \in \text{visible}_T(A,x) \vee B \in \text{dead}_T(A,x)$
c. $u = \text{result}(x,s)$, where $s = \langle\langle \text{visible}_T(A,x); \text{data}_T \rangle\rangle$
d. $T1 \in \text{ANC-ABORT}$
TRANSITIONS:
a. $\text{status}_{T1}(A) \leftarrow \text{‘committed’}$
b. $\text{label}_{T1}(A) \leftarrow u$
c. $\text{data}_{T1} \leftarrow \text{data}_T \cup \{(B,A): B \in \text{datasteps}_T(x)\} \cup \{(A,A)\}$
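The structural preconditions of create, commit, and abort can be exercised with a small Python sketch. It deliberately covers only part of the specification: the ANC-ABORT precondition and the perform event are omitted, the state is reduced to a status map, and the static naming tree (`parent`) and sibling order (`seq`) are supplied as hypothetical inputs. The names `done`, `enabled`, and `step` are illustrative.

```python
def done(status, a):
    """a is done if it has been created and has committed or aborted."""
    return status.get(a) in ("committed", "aborted")


def enabled(status, parent, seq, event):
    """Structural preconditions only (ANC-ABORT and perform omitted)."""
    kind, a = event
    if kind == "create":
        return (a not in status
                and status.get(parent.get(a)) == "active"
                and all(done(status, b) for (b, c) in seq
                        if c == a and b != a))
    if kind == "commit":
        children = [c for c in status if parent.get(c) == a]
        return (status.get(a) == "active"
                and all(done(status, c) for c in children))
    if kind == "abort":
        return status.get(a) == "active"
    raise ValueError(kind)


def step(status, parent, seq, event):
    """Return the next status map, or None if the event is not enabled."""
    if not enabled(status, parent, seq, event):
        return None
    kind, a = event
    nxt = dict(status)
    nxt[a] = {"create": "active", "commit": "committed",
              "abort": "aborted"}[kind]
    return nxt
```

With these checks an action cannot be created twice, a parent cannot commit while a child is still active, and a sibling that sequentially follows another cannot be created until its predecessor is done.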


6.2.3 Specification of Mapping $h_{10}$


We define the mapping $h_{10}$: L1 → L0 in the obvious way. Our goal, of course, is to show that
this mapping is a possibilities map.


State Mapping
$h_{10}: \Sigma_1 \to \Sigma_0$ is defined by $h_{10}(T) = \text{erase}(T)$, $\forall T \in \Sigma_1$.


Event Mapping
$h_{10}: \mathcal{E}_1 \to \mathcal{E}_0$ is the identity map on events.


6.2.4 Proof Strategy for Showing $h_{10}$ is a Possibilities Map

We can show easily that $h_{10}$ preserves initial states and transitions:

Lemma 6.2.4.1: $h_{10}$ preserves initial states.

Proof: $h_{10}(T_0) = \text{erase}(T_0) = \sigma_0$, the trivial action tree, by definition. ∎

Lemma 6.2.4.2: $h_{10}$ preserves transitions.


Proof: It is obvious by inspection that $h_{10}$ preserves transitions, since transitions for all events
are identical at levels L0 and L1 (except for transition T1.4c, which involves the data ordering
-- but $\text{data}_T$ is projected out by the state mapping). ∎


Showing that $h_{10}$ preserves preconditions is more difficult. We use the following lemma to
reduce this problem to a view-serializability condition on reachable states in L1:


Lemma 6.2.4.3: Suppose that for all $T \in \mathcal{R}_1$, T is view-serializable. Then $h_{10}$ preserves
preconditions.

Proof: To show that $h_{10}$ preserves preconditions, we must show that
$T \in PRE_1(e) \cap \mathcal{R}_1$, $h_{10}(T) \in \mathcal{R}_0 \Rightarrow h_{10}(T) \in PRE_0(h_{10}(e))$.


But the only precondition at Level 0 is that the next state must be in VSR. Thus
$h_{10}(T) \in PRE_0(h_{10}(e)) \Leftrightarrow h_{10}(T)h_{10}(e) \in VSR$.


Since $h_{10}$ preserves transitions, $h_{10}(T)h_{10}(e) = h_{10}(Te) = \text{erase}(Te)$. Thus we must show that
$\text{erase}(Te) \in VSR$. Since view-serializability of AAT’s is defined to be view-serializability of
the corresponding action tree, we must show that
$T \in PRE_1(e) \cap \mathcal{R}_1$, $\text{erase}(T) \in \mathcal{R}_0 \Rightarrow Te$ is view-serializable.


But $T \in PRE_1(e) \cap \mathcal{R}_1 \Rightarrow Te \in \mathcal{R}_1$. Thus it suffices to show that all reachable states in L1
are view-serializable. ∎

View-serializability of reachable states is thus our main theorem for L1, which will imply that
$h_{10}$ preserves preconditions (and is thus a possibilities map). We state this theorem here, although its
proof will be given in several stages:

Theorem 6.2.4.4: Let $T \in \mathcal{R}_1$. Then T is view-serializable.


The proof of this theorem consists of showing that for each action $A \in \text{vertices}_T$, $\text{vtree}_T(A)$ is a
serializable view tree for A. The proof that $S = \text{vtree}_T(A)$ is a serializable view tree is given in three
subordinate lemmas which show that (1) S is a view tree for A in T, (2) S is version-compatible, and (3)
there are no cycles (of length 2 or greater) in $\text{seq} \cup \text{sibling-data}_S$. By Theorem 2.5.1, it follows that S is a
serializable view tree for A. We state these lemmas here, although the proofs are deferred to later
sections.


Lemma 6.2.4.5: Let $T \in \mathcal{R}_1$, let $A \in \text{vertices}_T$, and let $S = \text{vtree}_T(A)$. Then S is a view tree
for A in T.


Lemma 6.2.4.6: Let $T \in \mathcal{R}_1$, let $A \in \text{vertices}_T$, and let $S = \text{vtree}_T(A)$. Then S is
version-compatible.


Lemma 6.2.4.7: Let $T \in \mathcal{R}_1$, let $A \in \text{vertices}_T$, and let $S = \text{vtree}_T(A)$. Then $\text{seq} \cup
\text{sibling-data}_S$ has no cycles of length two or greater.

6.3 Auxiliary Algebra La 


We define an “auxiliary” event-state algebra, La. (La is “auxiliary” because it is not part of our 
main event-state algebra hierarchy.) La is identical to L1, except that the ANC-ABORT preconditions on 
events (preconditions P1.1d, P1.2c, P1.3b, and P1.4d) are omitted. 


We define the trivial mapping $h_{1a}$: L1 → La as the identity map on states and events.

Theorem 6.3.1: $h_{1a}$ is a possibilities map.


Proof: Since initial states are identical in L1 and La, and $h_{1a}$ is the identity on states, $h_{1a}$
preserves initial states. Since the transitions at Level 1 and Level A are identical, and every precondition
at Level A also appears at Level 1, $h_{1a}$ must preserve transitions and preconditions. Thus $h_{1a}$ is a
possibilities map, by Lemma 4.2.2.6. ∎


Since this mapping fixes T (it must fix T since T is the entire state), we will show that all invariants (and 


pair-invariants) for La are invariant (or pair-invariant) for L1. 


We prove below several basic lemmas for algebra La. We will then apply these results to the
proofs of Lemmas 6.2.4.5, 6.2.4.6, and 6.2.4.7.


The advantage of defining La is that we will also construct a trivial possibilities map between
algebra L2 and algebra La. We will thus be able to apply Level A invariants directly to Level 2, and we
will use these invariants to show that $h_{21}$ (defined below) is a possibilities map.


6.3.1 Basic Lemmas for La 
6.3.1.1 Invariants and Pair-Invariants for La 
Lemma 6.3.1.1: Let (1,71) €&), and let A € vertices,. ‘Then the following are true: 


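a. $\text{vertices}_T \subseteq \text{vertices}_{T1}$, $\text{committed}_T \subseteq \text{committed}_{T1}$, $\text{aborted}_T \subseteq \text{aborted}_{T1}$,
$\text{data}_T \subseteq \text{data}_{T1}$

b. If $A \in \text{datasteps}_T$, then $\text{label}_T(A) = \text{label}_{T1}(A)$

c. If $A \in \text{datasteps}_T$ and $(B,A) \in \text{data}_{T1}$, then $(B,A) \in \text{data}_T$

d. $\text{visible}_T(A) \subseteq \text{visible}_{T1}(A)$

e. $\text{dead}_T(A) \subseteq \text{dead}_{T1}(A)$

f. If A is live in T1, then A is live in T

g. If A is dead in T, then A is dead in T1 and $\{\text{crucial}_{T1}(A)\} < \{\text{crucial}_T(A)\}$

h. $\text{v-anc-seq}_{T1}(A) = \text{v-anc-seq}_T(A)$, $\text{i-anc-seq}_{T1}(A) = \text{i-anc-seq}_T(A)$


Proof: Straightforward. ∎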


Lemma 6.3.1.1.2: Let $T \in \mathcal{R}_a$. Then the following invariants hold:

a. T is an AAT, i.e. $A \in \text{vertices}_T \Rightarrow \text{parent}(A) \in \text{vertices}_T$

b. If $A \in \text{vertices}_T$ and $(B,A) \in \text{seq}$ and $B \neq A$, then $B \in \text{done}_T$

c. If $A \in \text{vertices}_T$ and $\text{parent}(A) \in \text{committed}_T$ then $A \in \text{done}_T$

d. $U \in \text{active}_T$

e. If $(B,A) \in \text{data}_T$, then $B \in \text{visible}_T(A) \vee B \in \text{dead}_T(A)$

f. If $A \in \text{committed}_T$ and $B \in \text{desc}(A) \cap \text{vertices}_T$ then $B \in \text{visible}_T(A) \vee B \in
\text{dead}_T(A)$

g. If $(B,A) \in \text{i-data}_T$, then $\text{crucial}_T(B)$ is defined, and $\text{crucial}_T(B) \in \text{desc}(A \rfloor B)$

Proof: All are obvious except for (e) and (f) ((g) follows directly from (e)):

e) If B = A then the result is immediate. If $B \neq A$, then
let $T = T_0 v$, where $v \in \mathcal{V}_a$ can be written as $\phi\pi\psi$, with $\pi$ = perform A,u.
Let $T1 = T_0\phi$, and let $T2 = T_0\phi\pi$.

By Lemma 6.3.1.1.1c, $(B,A) \in \text{data}_{T2} \Rightarrow B \in \text{datasteps}_{T1}(x)$. By precondition Pa.4b for
perform, $B \in \text{visible}_{T1}(A,x) \vee B \in \text{dead}_{T1}(A,x)$.

$B \in \text{visible}_{T1}(A,x) \Rightarrow B \in \text{visible}_T(A,x)$ (by Lemma 6.3.1.1.1d), $\Rightarrow B \in \text{visible}_T(A)$.
$B \in \text{dead}_{T1}(A,x) \Rightarrow B \in \text{dead}_T(A,x)$ (by Lemma 6.3.1.1.1e), $\Rightarrow B \in \text{dead}_T(A)$.

f) If B = A, then the result is immediate. So assume $B \in \text{prop-desc}(A)$, and assume $B \notin
\text{visible}_T(A)$. Let $C \in \text{prop-desc}(A) \cap \text{anc}(B)$ be the highest ancestor of B which is not
committed. Then $\text{parent}(C) \in \text{committed}_T \Rightarrow C \in \text{done}_T$ (by Lemma 6.3.1.1.2c). But $C \notin
\text{committed}_T$ by assumption $\Rightarrow C \in \text{aborted}_T$. ∎


Lemma 6.3.1.1.3: Let (1/11) € ®{ ) and let A € committed,. Then the following are true: 


a. $\text{children}_{T1}(A) = \text{children}_T(A)$

b. $\text{v-child}_{T1}(A) = \text{v-child}_T(A)$, $\text{i-child}_{T1}(A) = \text{i-child}_T(A)$

c. $\text{v-data}_{T1}(A) = \text{v-data}_T(A)$, $\text{i-data}_{T1}(A) = \text{i-data}_T(A)$,
$\text{v-data-anc}_{T1}(A) = \text{v-data-anc}_T(A)$

d. $\text{i-data-anc}_{T1}(A) < \text{i-data-anc}_T(A)$

e. $\text{v-precedes}_{T1}(A) = \text{v-precedes}_T(A)$, $\text{vset}_{T1}(A) = \text{vset}_T(A)$

f. $\text{i-precedes}_{T1}(A) < \text{i-precedes}_T(A)$


Proof:

a) Clearly $\text{children}_T(A) \subseteq \text{children}_{T1}(A)$. Suppose $B \in \text{children}_{T1}(A) - \text{children}_T(A)$. (We
can assume $A \notin \text{accesses}$.)

Let $T1 = T_0 v$, where $v \in \mathcal{V}_a$ can be written as $\phi\pi\psi\rho\chi$, with $\pi$ = commit A, $\rho$ = create B.
Let $T2 = T_0\phi\pi\psi$.

Then $A \in \text{committed}_{T2}$. But precondition Pa.1b requires that $A \in \text{active}_{T2}$, a contradiction.

b) Follows directly from (a) and Lemma 6.3.1.1.1d

c) Because any datastep which occurs after perform A,u must follow A in the data ordering,
$\text{v-data}_{T1}(A) \cup \text{i-data}_{T1}(A) = \text{v-data}_T(A) \cup \text{i-data}_T(A)$. But $\text{i-data}_T(A) \subseteq \text{dead}_T(A)$ by
Lemma 6.3.1.1.2e, and $\text{dead}_T(A) \subseteq \text{dead}_{T1}(A)$ by Lemma 6.3.1.1.1e. Thus $\text{i-data}_T(A) \subseteq
\text{i-data}_{T1}(A)$.

But $\text{v-data}_T(A) \subseteq \text{v-data}_{T1}(A)$ by Lemma 6.3.1.1.1d. It follows directly that $\text{v-data}_T(A) =
\text{v-data}_{T1}(A)$, and $\text{i-data}_T(A) = \text{i-data}_{T1}(A)$. Equality of v-data directly implies equality of
v-data-anc.

d) Follows directly from (c) and Lemma 6.3.1.1.1g

e) Equality of $\text{v-precedes}_{T1}(A)$ and $\text{v-precedes}_T(A)$ follows directly from parts (b) and (c)
and from Lemma 6.3.1.1.1h. To show that $\text{vset}_{T1}(A) = \text{vset}_T(A)$, we can argue inductively
since $B \in \text{v-precedes}_T(A) \Rightarrow B \in \text{committed}_T$ by Lemma 2.3.2.

f) Follows directly from parts (b) and (d) and from Lemma 6.3.1.1.1h. ∎

Lemma 6.3.1.1.4: Let $T \in \mathcal{R}_a$, $A \in \text{committed}_T$. Then $\text{i-precedes}_T(A) \subseteq \text{aborted}_T$.


Proof: Let $B \in \text{i-precedes}_T(A)$.

$B \in \text{i-anc-seq}_T(A) \Rightarrow B \in \text{done}_T$ by Lemma 6.3.1.1.2b, $\Rightarrow B \in \text{aborted}_T$ (since $B \notin
\text{visible}_T(A)$).

$B \in \text{i-child}_T(A) \Rightarrow B \in \text{done}_T$ by Lemma 6.3.1.1.2c, $\Rightarrow B \in \text{aborted}_T$ (since $B \notin
\text{visible}_T(A)$).

$B \in \text{i-data-anc}_T(A) \Rightarrow B = \text{crucial}_T(b)$, for $b \in \text{i-data}_T(A)$. By Lemma 6.3.1.1.2g, B is
defined, $\Rightarrow B \in \text{aborted}_T$. ∎


6.3.1.2 Event Orderings in La 


This section presents some constraints on the ordering of events in valid execution sequences for 
La. In the following lemmas (and in the proofs that follow) we will simplify our notation by referring to 
both "perform A,u" events and "commit A" events as “commit A." This convention causes no 


complications; it requires only that we realize that events written as "commit A" might refer to datasteps. 


Lemma 6.3.1.2.1: Let $v \in \mathcal{V}_a$ be a valid execution sequence from La, then $\xrightarrow{v}$ is acyclic -- i.e.,
no event can be repeated in a valid execution sequence.


Proof: Suppose event e could be repeated in a valid execution, v, i.e. $v = \alpha e \beta e \gamma \in \mathcal{V}_a$.

Let $T1 = T_0\alpha e$, $T2 = T_0\alpha e \beta$. By Lemma 6.3.1.1.1,

e = create A $\Rightarrow A \in \text{vertices}_{T2}$.
e = commit A $\Rightarrow A \in \text{committed}_{T2}$.
e = abort A $\Rightarrow A \in \text{aborted}_{T2}$.

But by the preconditions for events,
e = create A requires $A \notin \text{vertices}_{T2}$ (Pa.1a).
e = commit A requires $A \in \text{active}_{T2}$ (Pa.2a and Pa.4a).
e = abort A requires $A \in \text{active}_{T2}$ (Pa.3a).

Thus no event can be repeated. ∎

Lemma 6.3.1.2.2: Let $v \in \mathcal{V}_a$, $T = T_0 v$, $A \in \text{committed}_T$, then
create A $\xrightarrow{v}$ commit A


Proof: v can be written as $\phi\pi\psi$, with $\pi$ = commit A.

Let $T1 = T_0\phi$.

Precondition Pa.2a (or Pa.4a if $A \in \text{accesses}$) requires $A \in \text{active}_{T1}$,
$\Rightarrow$ create A $\in \phi$,
$\Rightarrow$ create A $\xrightarrow{v}$ commit A. ∎

Lemma 6.3.1.2.3: Let $v \in \mathcal{V}_a$, $T = T_0 v$, $A \in \text{aborted}_T$, then
create A $\xrightarrow{v}$ abort A


Proof: Similar to the proof of Lemma 6.3.1.2.2 above. ∎


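Lemma 6.3.1.2.4: Let $v \in \mathcal{V}_a$, $T = T_0 v$, $(B,A) \in \text{data}_T$, $B \neq A$, then
commit B $\xrightarrow{v}$ commit A


Proof: v can be written as $\phi\pi\psi$, with $\pi$ = commit A (= perform A,u).

Let $T1 = T_0\phi$, and let $T2 = T_0\phi\pi$.

By Lemma 6.3.1.1.1c, $(B,A) \in \text{data}_{T2}$,
$\Rightarrow B \in \text{committed}_{T1}$,
$\Rightarrow$ commit B $\xrightarrow{v}$ $\pi$ (= commit A). ∎

Lemma 6.3.1.2.5: Let $v \in \mathcal{V}_a$, $T = T_0 v$, $A \in \text{datasteps}_T$, $B \in \text{v-data}_T(A)$, then
commit $A \rfloor B$ $\xrightarrow{v}$ commit A


Proof: $A \in \text{datasteps}_T \Rightarrow$ commit A $\in v$.
Thus v can be written as $\phi\pi\psi$, with $\pi$ = commit A (= perform A,u).

Let $T1 = T_0\phi$, and let $T2 = T_0\phi\pi$.

By Lemma 6.3.1.1.1c, $(B,A) \in \text{data}_{T2}$.
By Lemma 6.3.1.1.2e, $B \in \text{visible}_{T2}(A) \vee B \in \text{dead}_{T2}(A)$.
If $B \in \text{dead}_{T2}(A)$ then $B \in \text{dead}_T(A)$, $\Rightarrow B \notin \text{visible}_T(A)$, a contradiction.

Thus $B \in \text{visible}_{T2}(A)$,
$\Rightarrow A \rfloor B \in \text{committed}_{T1}$,
$\Rightarrow$ commit $A \rfloor B$ $\in \phi$,
$\Rightarrow$ commit $A \rfloor B$ $\xrightarrow{v}$ commit A. ∎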

Lemma 6.3.1.2.6: Let v ∈ 𝒱_a, T = T_0v, A ∈ vertices_T, A′ ∈ prop-anc(A) - {U}. Then
create A′ → create A

Proof: A ∈ vertices_T ⇒ create A ∈ v.

Thus v can be written as φπψ, with π = create A.

Let T1 = T_0φ, and let T2 = T_0φπ.

By precondition Pa.1b, parent(A) ∈ active_T1,
⇒ create parent(A) → create A (unless parent(A) = U).

The Lemma follows by an obvious induction. ∎


Lemma 6.3.1.2.7: Let v ∈ 𝒱_a, T = T_0v, A ∈ vertices_T, B ∈ v-anc-seq_T(A). Then
commit B → create A

Proof: A ∈ vertices_T ⇒ create A ∈ v.
B ∈ v-anc-seq_T(A) ⇒ ∃A′ ∈ anc(A): (B,A′) ∈ seq.

By Lemma 6.3.1.2.6, create A′ →* create A.

v can be written as φπψ, with π = create A′.

Let T1 = T_0φ, and let T2 = T_0φπ.

By precondition Pa.1c, B ∈ done_T1,
⇒ commit B → create A′ →* create A. ∎

Lemma 6.3.1.2.8: Let v ∈ 𝒱_a, T = T_0v, A ∈ vertices_T, B ∈ i-anc-seq_T(A). Then
abort B → create A

Proof: Similar to the proof of Lemma 6.3.1.2.7 above. ∎


Lemma 6.3.1.2.9: Let v ∈ 𝒱_a, T = T_0v, A ∈ committed_T, B ∈ v-prop-desc_T(A). Then
commit B → commit A

Proof: A ∈ committed_T ⇒ commit A ∈ v. Note that since A has a proper descendant, A ∉ accesses. Assume that B ∈ v-child_T(A); the Lemma follows from this case by an obvious induction.

v can be written as φπψ, with π = commit A.

Let T1 = T_0φ, and let T2 = T_0φπ.
But B cannot be created after A has committed, so B ∈ vertices_T1.

By precondition Pa.2b, B ∈ done_T1,
⇒ commit B ∈ φ,
⇒ commit B → commit A. ∎
Lemma 6.3.1.2.10: Let v ∈ 𝒱_a, T = T_0v, A ∈ committed_T, B ∈ i-child_T(A). Then
abort B → commit A

Proof: Similar to the proof of Lemma 6.3.1.2.9 above. ∎


Lemma 6.3.1.2.11: Let v ∈ 𝒱_a, T = T_0v, A,B ∈ committed_T, B ∈ v-precedes_T^+(A). Then
commit B → commit A

Proof: We show C ∈ v-precedes_T(B) ⇒ commit C → commit B. The Lemma follows by an obvious induction.

C ∈ v-precedes_T(B) ⇒
C ∈ v-anc-seq_T(B) ∨ C ∈ v-child_T(B) ∨ C ∈ v-data-anc_T(B).

If C ∈ v-anc-seq_T(B), then
commit C → create B by Lemma 6.3.1.2.7.
But create B → commit B by Lemma 6.3.1.2.2,
⇒ commit C → commit B.

If C ∈ v-child_T(B), then
commit C → commit B by Lemma 6.3.1.2.9.

If C ∈ v-data-anc_T(B), then
C = B⌋c, where c ∈ v-data_T(B),
⇒ commit C → commit B by Lemma 6.3.1.2.5. ∎
Lemma 6.3.1.2.12: Let v ∈ 𝒱_a, T = T_0v, A,B ∈ committed_T, B ∈ vset_T(A). Then
commit B →* commit A

Proof: Immediate corollary of Lemma 6.3.1.2.11. ∎


Lemma 6.3.1.2.13: Let v ∈ 𝒱_a, T = T_0v, A ∈ committed_T, B ∈ i-precedes_T(A). Then
create B → commit A

Proof: B ∈ i-precedes_T(A) ⇒
B ∈ i-anc-seq_T(A) ∨ B ∈ i-child_T(A) ∨ B ∈ i-data-anc_T(A).

If B ∈ i-anc-seq_T(A), then abort B → create A by Lemma 6.3.1.2.8, and
create B → abort B, create A → commit A,
⇒ create B → commit A.

If B ∈ i-child_T(A), then abort B → commit A by Lemma 6.3.1.2.10, and create B → abort B,
⇒ create B → commit A.

If B ∈ i-data-anc_T(A), then B = crucial_T(B′) for some (B′,A) ∈ data_T. Thus B′ ∈ desc(B).
But by Lemma 6.3.1.2.4, commit B′ → commit A, and by Lemma 6.3.1.2.6, create B →* create B′ → commit B′,
⇒ create B → commit A. ∎


Lemma 6.3.1.2.14: Let v ∈ 𝒱_a, T = T_0v, A,B ∈ committed_T, B ∈ vset_T(A), C ∈ i-precedes_T(B). Then
create C → commit A

Proof: Immediate corollary of Lemmas 6.3.1.2.12 and 6.3.1.2.13. ∎


6.3.2 Version-Compatibility in La 


Lemma 6.2.4.6 states that if T is a reachable AAT in L1, and A ∈ vertices_T, then vtree_T(A) is version-compatible. In this section we develop two lemmas which will be used in the proof of Lemma 6.2.4.6. First we show that if T is any AAT which is version-compatible, then any restriction of T to a v-data_T-closed set is also version-compatible. We then show that any reachable tree in La is version-compatible. We will show in a later section that for any reachable tree, T, in L1, vtree_T(A) is a (backed up) restriction of T to a v-data_T-closed set, which will complete the proof of Lemma 6.2.4.6.


Lemma 6.3.2.1: Let T be an AAT, V ⊆ vertices_T, where V is anc-closed and v-data_T-closed. If T is version-compatible, then T|V is version-compatible.

Proof: Let S = T|V. Note that S is an AAT since V is anc-closed. Let A ∈ datasteps_S(x). We must show that label_S(A) = result(x,r), where r = <<v-data_S(A); data_S>>.

By definition, label_S(A) = label_T(A).

But T is version-compatible,
⇒ label_T(A) = result(x,r′), where r′ = <<v-data_T(A); data_T>>.

Thus it suffices to show r = r′. But data_S ⊆ data_T, thus it suffices to show set equality. It is obvious that r ⊆ r′.

So suppose B ∈ r′, B ≠ A,
⇒ B ∈ visible_T(A) ∧ (B,A) ∈ data_T
⇒ B ∈ V, since V is v-data_T-closed and A ∈ V,
⇒ B ∈ visible_S(A) ∧ (B,A) ∈ data_S (since V is anc-closed),
⇒ B ∈ r. ∎


Lemma 6.3.2.2: Let T ∈ ℛ_a. Then T is version-compatible.

Proof: Let A ∈ datasteps_T(x). We must show that u (= label_T(A)) = result(x,s), where s = <<v-data_T(A); data_T>>.

Let T = T_0v, where v ∈ 𝒱_a can be written as φπψ, with π = perform A,u.
Let T1 = T_0φ, and let T2 = T_0φπ.
By precondition Pa.4c, u = result(x,s′), where s′ = <<visible_T1(A,x); data_T1>>.

Thus it suffices to show s = s′, and since data_T1 ⊆ data_T it suffices to show set equality.

First, let B ∈ s.

(B,A) ∈ data_T, but A ∈ datasteps_T2 ⇒ (B,A) ∈ data_T2,
⇒ B ∈ datasteps_T1,
⇒ B ∈ datasteps_T1(x).

By precondition Pa.4b, B ∈ datasteps_T1(x) ⇒ B ∈ visible_T1(A,x) ∨ B ∈ dead_T1(A,x).

But if B ∈ dead_T1(A,x), then B ∈ dead_T(A,x) by Lemma 6.3.1.1.1e,
⇒ B ∉ visible_T(A), which is a contradiction.
Thus B ∈ visible_T1(A,x), ⇒ B ∈ s′.

Conversely, suppose B ∈ s′. We know B ≠ A since A ∉ datasteps_T1.
(B,A) ∈ data_T2 ⇒ (B,A) ∈ data_T.
B ∈ visible_T1(A,x) ⇒ B ∈ visible_T(A,x), ⇒ B ∈ s. ∎
6.3.3 Properties of Aborts Sets in La 


In this section we present some properties of aborts sets for reachable trees in La. The first lemma is not strictly a property of aborts sets, but it justifies use of the recursive form of ABORTS in inductive proofs, so we include it here.
Lemma 6.3.3.1: Let T ∈ ℛ_a, A ∈ vertices_T. Then A ∉ v-precedes_T^+(A).

Proof: Let T = T_0v for some v ∈ 𝒱_a. Suppose A ∈ v-precedes_T^+(A).
By Lemma 6.3.1.2.11, A ∈ committed_T and
commit A → commit A.

But → is acyclic for v ∈ 𝒱_a, so this is impossible. ∎


Lemma 6.3.3.2: Let T ∈ ℛ_a, A ∈ committed_T, (A,B) ∈ seq, A ≠ B. Then
ABORTS_T(A) ∩ desc(B) = ∅

Proof: Let T = T_0v, for some v ∈ 𝒱_a.
ABORTS_T(A) = ⋃_{C ∈ vset_T(A)} i-precedes_T(C).

Let D ∈ i-precedes_T(C), for some C ∈ vset_T(A). We show D ∉ desc(B). Since A ∈ committed_T, create D → commit A, by Lemma 6.3.1.2.14.

But if D ∈ desc(B), then commit A → create D, by Lemma 6.3.1.2.7. But → must be acyclic, so we have a contradiction. ∎
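
As an illustration only, the set-union form of ABORTS used in the proof above and the recursive form that Lemma 6.3.3.1 justifies can be sketched in Python. The dictionaries `v_precedes` and `i_precedes`, the function names, and the action names are hypothetical stand-ins for v-precedes_T and i-precedes_T; the sketch assumes that vset_T(A) is A together with everything reachable from A through v-precedes_T, and that v-precedes_T is acyclic, so the recursion terminates.

```python
from typing import Dict, Hashable, Set

Action = Hashable
Relation = Dict[Action, Set[Action]]  # maps an action A to the set R(A)


def vset(a: Action, v_precedes: Relation) -> Set[Action]:
    """A together with everything reachable from A via v-precedes (assumed definition)."""
    seen = {a}
    stack = [a]
    while stack:
        current = stack.pop()
        for b in v_precedes.get(current, set()):
            if b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def aborts_union(a: Action, v_precedes: Relation, i_precedes: Relation) -> Set[Action]:
    """Union of i-precedes(C) over all C in vset(A)."""
    result: Set[Action] = set()
    for c in vset(a, v_precedes):
        result |= i_precedes.get(c, set())
    return result


def aborts_recursive(a: Action, v_precedes: Relation, i_precedes: Relation) -> Set[Action]:
    """i-precedes(A) plus ABORTS(B) for each B in v-precedes(A).

    Terminates only because v-precedes is assumed acyclic.
    """
    result = set(i_precedes.get(a, set()))
    for b in v_precedes.get(a, set()):
        result |= aborts_recursive(b, v_precedes, i_precedes)
    return result


# Hypothetical data: A is visibly preceded by B and C, and B by C.
V_PRECEDES: Relation = {"A": {"B", "C"}, "B": {"C"}}
I_PRECEDES: Relation = {"A": {"X"}, "C": {"Y"}}
```

On acyclic input the two functions return the same set, which is the property the inductive proofs rely on.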


Lemma 6.3.3.3: Let (T,T1) ∈ ℛ_a^(2) and let A ∈ committed_T. Then
ABORTS_T1(A) < ABORTS_T(A)

Proof: ABORTS_T1(A) = ⋃_{B ∈ vset_T1(A)} i-precedes_T1(B).

By Lemma 6.3.1.1.3e, vset_T1(A) = vset_T(A)
⇒ ABORTS_T1(A) = ⋃_{B ∈ vset_T(A)} i-precedes_T1(B).

But A ∈ committed_T and v-precedes_T^+(A) ⊆ committed_T by Lemma 2.3.2,
⇒ vset_T(A) ⊆ committed_T by definition of vset_T.

But B ∈ committed_T ⇒ i-precedes_T1(B) < i-precedes_T(B), by Lemma 6.3.1.1.3f,
⇒ ABORTS_T1(A) < ⋃_{B ∈ vset_T(A)} i-precedes_T(B) (using Lemma 2.2.1.1),
⇒ ABORTS_T1(A) < ABORTS_T(A). ∎


Lemma 6.3.3.4: Let (T,T1) ∈ ℛ_a^(2), and let A ∈ vertices_T. Then
SEQ-ABORTS_T1(A) < SEQ-ABORTS_T(A)

Proof: SEQ-ABORTS_T1(A) = i-anc-seq_T1(A) ∪ ⋃_{B ∈ v-anc-seq_T1(A)} ABORTS_T1(B)
= i-anc-seq_T(A) ∪ ⋃_{B ∈ v-anc-seq_T(A)} ABORTS_T1(B), by Lemma 6.3.1.1.1h.

But B ∈ v-anc-seq_T(A) ⇒ B ∈ committed_T
⇒ ABORTS_T1(B) < ABORTS_T(B), by Lemma 6.3.3.3.

The lemma follows directly using Lemma 2.2.1.1. ∎


6.4 Proof of Possibilities Map for h_10

We now return to the task of showing that h_10 is a possibilities map. First we must prove Lemmas 6.2.4.5, 6.2.4.6, and 6.2.4.7.

We first state an obvious lemma for L1: all reachable AAT's are in ANC-ABORT.

Lemma 6.4.1: Let T ∈ ℛ_1. Then T ∈ ANC-ABORT.

Proof: Let T = T_0v, for some v ∈ 𝒱_1. If v ≠ Λ, then T = T′e for some e ∈ ℰ_1, T′ ∈ PRE_1(e), and by the ANC-ABORT precondition for e, T ∈ ANC-ABORT. If v = Λ, then T = T_0, which is trivially in ANC-ABORT. ∎
We will use this ANC-ABORT property, together with results from La, to prove Lemmas 6.2.4.5, 6.2.4.6, and 6.2.4.7.

Let Ia denote the property of T which is the conjunction of the properties stated in Lemmas 6.3.1.1.2, 6.3.1.1.4, 6.3.2.2, 6.3.3.1, and 6.3.3.2. (Recall that all invariant abbreviations are cross-referenced to lemmas in Appendix I.)

Let Ja denote the pair-property of T which is the conjunction of the pair-properties stated in Lemmas 6.3.1.1.1, 6.3.1.1.3, 6.3.3.3, and 6.3.3.4.

Lemma 6.4.2: Ia is invariant in L1, and Ja is pair-invariant in L1.

Proof: h_1a is a possibilities map by Theorem 6.3.1. But h_1a fixes T. Since Ia is invariant for T in La, Ia is invariant for T in L1 by Lemma 4.2.4.3.5. Similarly, since Ja is pair-invariant for T in La, Ja is pair-invariant for T in L1, by Lemma 4.2.4.3.5. ∎

Let Sa denote the property of event sequences which is the conjunction of the properties stated in Lemmas 6.3.1.2.1 through 6.3.1.2.14.

Lemma 6.4.3: Let v ∈ 𝒱_1. Then Sa holds for v.

Proof: Since h_1a is a possibilities map, it is a valid interpretation, by Lemma 4.2.2.5. Thus h_1a(v) ∈ 𝒱_a. But h_1a is the identity map on events, so h_1a(v) = v. Since Sa holds for all event sequences in 𝒱_a, Sa holds for v. ∎
Now we prove a preliminary lemma for L1, which shows that tracing back the visible precedence relation from any action cannot lead to an ancestor of that action.

Lemma 6.4.4: Let T ∈ ℛ_1, and let A ∈ vertices_T. Then anc(A) ∩ v-precedes_T^+(A) = ∅.

Proof: Suppose B ∈ anc(A) ∩ v-precedes_T^+(A).

By Lemma 2.3.2, B ∈ v-precedes_T^+(A) ⇒ B ∈ committed_T. By Lemma 6.3.1.1.2f, A ∈ desc(B) ⇒ A ∈ visible_T(B), or A ∈ dead_T(B).

If A ∈ visible_T(B), then by Lemma 6.3.1.2.9,
commit A → commit B.

But B ∈ v-precedes_T^+(A), A ∈ committed_T ⇒ commit B → commit A, by Lemma 6.3.1.2.11, a contradiction.

If A ∈ dead_T(B), then let C be the lowest (in ancestor order) action in anc(A) ∩ v-desc_T(B). Clearly C ∈ vset_T(B) ⇒ C ∈ vset_T(A), by Lemma 2.3.3a. A ∈ prop-desc(C) since A ∉ visible_T(B). Let D = C⌋A.

But D ∉ committed_T, since otherwise D would be visible to B, contradicting our choice of C as the lowest visible descendant of B which is an ancestor of A.
⇒ D ∈ aborted_T (by Lemma 6.3.1.1.2c),
⇒ D ∈ i-child_T(C) ⇒ D ∈ ABORTS_T(C).

But C ∈ vset_T(A) ⇒ ABORTS_T(C) ⊆ ABORTS_T(A), by Lemma 6.2.1.5,
⇒ D ∈ ABORTS_T(A).

But D ∈ anc(A), which contradicts T ∈ ANC-ABORT. ∎


6.4.1 Proof of Lemma 6.2.4.5 


Let T ∈ ℛ_1, A ∈ vertices_T. Let S = vtree_T(A). By Lemma 6.4.4, anc(A) ∩ v-precedes_T^+(A) = ∅,
⇒ prop-anc(A) ∩ vset_T(A) = ∅ (since vset_T(A) = v-precedes_T^+(A) ∪ {A}),
⇒ S is a view tree for A in T, by Lemma 3.5.2. ∎
6.4.2 Proof of Lemma 6.2.4.6 


Let T ∈ ℛ_1, A ∈ vertices_T. Let S = vtree_T(A). By Lemma 6.3.2.2, T is version-compatible.

Let W = vset_T(A) ∪ prop-anc(A). By Lemmas 6.4.4 and 3.5.2, vtree_T(A) = (T|W)//A. W is v-data_T-closed by Lemma 2.3.3c. By Lemma 6.3.2.1, T|W is version-compatible. But since backing up proper ancestors of A to active status cannot affect the labels of accesses of S, S is version-compatible. ∎
6.4.3 Proof of Lemma 6.2.4.7

Let T ∈ ℛ_1, A ∈ vertices_T.

We show that if S = vtree_T(A), then seq_S ∪ sibling-data_S is acyclic. Let V = vset_T(A), W = vset_T(A) ∪ prop-anc(A). By Lemmas 6.4.4 and 3.5.2, S = (T|W)//A. Thus data_S ⊆ data_T, seq_S ⊆ seq_T. The proof will be by contradiction:

Let (A_1, A_2, ..., A_n) be a cycle in seq_S ∪ sibling-data_S (with n > 2).
Then (A_1, A_2, ..., A_n) is a cycle in seq_T ∪ sibling-data_T, and A_i ∈ W.

Let P be the common parent of {A_i}.

We will use the convention that subscripts are taken modulo n, i.e. we regard A_{n+1} = A_1.

First we prove a preliminary lemma:
Lemma 6.4.3.1: If A ∉ desc(A_i) ∪ desc(A_{i+1}), then A_i ∈ v-precedes_T^+(A_{i+1}).

Proof: We show A_i ∈ vset_T(A_{i+1}). Since A_i ≠ A_{i+1}, the Lemma follows directly.

(A_i, A_{i+1}) ∈ seq_S
⇒ (A_i, A_{i+1}) ∈ seq_T, and A_i, A_{i+1} ∈ visible_T(A) (since vertices_S ⊆ visible_T(A)).
But A ∉ desc(A_i) ∪ desc(A_{i+1}) ⇒ A_i, A_{i+1} ∈ visible_T(P),
⇒ A_i ∈ v-seq_T(A_{i+1}),
⇒ A_i ∈ vset_T(A_{i+1}).

(A_i, A_{i+1}) ∈ sibling-data_S
⇒ ∃a_i ∈ desc(A_i), a_{i+1} ∈ desc(A_{i+1}): (a_i, a_{i+1}) ∈ data_S, and a_i, a_{i+1} ∈ visible_T(A) (since vertices_S ⊆ visible_T(A)).
But A ∉ desc(A_i) ∪ desc(A_{i+1}) ⇒ a_i, a_{i+1} ∈ visible_T(P),
⇒ a_i ∈ visible_T(a_{i+1}), ⇒ a_i ∈ v-data_T(a_{i+1}),
⇒ A_i ∈ v-data-anc_T(a_{i+1}) ⇒ A_i ∈ vset_T(a_{i+1}).

But a_{i+1} ∈ visible_T(P)
⇒ a_{i+1} ∈ v-desc_T(A_{i+1}), ⇒ a_{i+1} ∈ vset_T(A_{i+1}),
⇒ A_i ∈ vset_T(A_{i+1}). ∎
Proof of Lemma 6.2.4.7 (continued):

Suppose that A ∉ desc(A_i) ∀i; then by Lemma 6.4.3.1, A_i ∈ v-precedes_T^+(A_{i+1}) ∀i,
⇒ A_1 ∈ v-precedes_T^+(A_1) (since the A_i form a cycle), which contradicts Lemma 6.3.3.1.

Thus A ∈ desc(A_i), for some i. Assume without loss of generality that A ∈ desc(A_1).

Since n > 2, A_1 ≠ A_2. But A_2 ∈ vertices_S and A_2 ∉ anc(A),
⇒ A_2 ∈ v-precedes_T^+(A).

A_2 ∈ v-precedes_T^+(A)
⇒ ABORTS_T(A_2) ⊆ ABORTS_T(A), by Lemma 6.2.1.5.

Since (A_1,A_2) ∈ seq_S ∪ sibling-data_S, we have the two following cases:
1. (A_1,A_2) ∈ seq_S
2. (A_1,A_2) ∈ sibling-data_S

Case 1: (A_1,A_2) ∈ seq ⇒ A_1 ∈ done_T, by Lemma 6.3.1.1.2b.

If A_1 ∈ aborted_T then A_1 ∈ i-anc-seq_T(A_2), ⇒ A_1 ∈ ABORTS_T(A_2),
⇒ A_1 ∈ ABORTS_T(A), which contradicts T ∈ ANC-ABORT.

If A_1 ∈ committed_T, then A_1 ∈ v-anc-seq_T(A_2), ⇒ A_1 ∈ v-precedes_T(A_2),
⇒ A_1 ∈ v-precedes_T^+(A), which contradicts Lemma 6.4.4.

Case 2: (A_1,A_2) ∈ sibling-data_S
⇒ ∃b_1 ∈ desc(A_1), b_2 ∈ desc(A_2): (b_1,b_2) ∈ data_S.

b_2 ∈ visible_T(A) ⇒ b_2 ∈ visible_T(P), ⇒ b_2 ∈ v-desc_T(A_2), ⇒ b_2 ∈ vset_T(A_2),
⇒ b_2 ∈ v-precedes_T^+(A),
⇒ ABORTS_T(b_2) ⊆ ABORTS_T(A).

Case 2a: b_1 ∈ visible_T(b_2),
⇒ b_1 ∈ v-data_T(b_2), ⇒ A_1 ∈ v-data-anc_T(b_2), ⇒ A_1 ∈ v-precedes_T(b_2),
⇒ A_1 ∈ v-precedes_T^+(A), contradicting Lemma 6.4.4.

Case 2b: b_1 ∉ visible_T(b_2)
⇒ b_1 ∈ i-data_T(b_2). (See Fig. 6.1.)

Let B = crucial_T(b_1). By Lemma 6.3.1.1.2g, B is defined, and B ∈ desc(A_1).
B ∈ i-data-anc_T(b_2) ⇒ B ∈ ABORTS_T(b_2), ⇒ B ∈ ABORTS_T(A).

But B ∈ anc(A), since b_1 ∈ visible_T(A), contradicting T ∈ ANC-ABORT. ∎


6.4.4 Proof that h_10 is a Possibilities Map

We now have all the facts needed to show that h_10 is a possibilities map:

Theorem 6.4.4.1: h_10 is a possibilities map.

Proof: h_10 preserves initial states by Lemma 6.2.4.1. h_10 preserves transitions by Lemma 6.2.4.2. We have proven Lemmas 6.2.4.5, 6.2.4.6, and 6.2.4.7; thus every reachable state in L1 is view-serializable (Theorem 6.2.4.4). Thus h_10 preserves preconditions by Lemma 6.2.4.3. By Lemma 4.2.2.6, h_10 is a possibilities map. ∎


Fig. 6.1. Case 2b, Lemma 6.2.4.7 
6.5 Level 2 Algebra and Mapping h_21

At Level 2 we replace the ANC-ABORT condition with a precondition on perform events. Otherwise everything in Level 2 is identical to Level 1.

L2 = ⟨ℰ_2, Σ_2, σ_2, τ_2⟩
ℰ_2 = ℰ_1 = {create A, commit A, abort A, perform A,u}.
Σ_2 = Σ_1, the set of all augmented action trees.
σ_2 = σ_1 = T_0.

τ_2, the transition relation, is obtained by deleting the ANC-ABORT preconditions P1.1d, P1.2c, P1.3b, and P1.4d, and by inserting a new precondition for perform events:
(P2.4d) B ∈ visible_T(A,x) ⇒ anc(A) ∩ ABORTS_T(A⌋B) = ∅
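
To make the shape of precondition P2.4d concrete, the following Python sketch checks it for one perform event. It is an illustration under stated assumptions, not the thesis's algorithm: `anc`, `aborts`, and `toward` are hypothetical callables standing for anc, ABORTS_T, and the action that P2.4d passes to ABORTS for a given pair (A, B), and the small tables are invented data.

```python
from typing import Callable, Dict, Hashable, Iterable, Set, Tuple

Action = Hashable


def p2_4d_holds(
    a: Action,
    visible_versions: Iterable[Action],
    anc: Callable[[Action], Set[Action]],
    aborts: Callable[[Action], Set[Action]],
    toward: Callable[[Action, Action], Action],
) -> bool:
    """True if, for every visible B, anc(A) and ABORTS(toward(A, B)) are disjoint."""
    ancestors = anc(a)
    return all(ancestors.isdisjoint(aborts(toward(a, b))) for b in visible_versions)


# Hypothetical tree: U is the root, P is the parent of A, Q is a sibling of P,
# and B is an access below Q whose version is visible to A.
ANC: Dict[Action, Set[Action]] = {"A": {"A", "P", "U"}}
TOWARD: Dict[Tuple[Action, Action], Action] = {("A", "B"): "Q"}


def check(aborts_table: Dict[Action, Set[Action]]) -> bool:
    """Evaluate the precondition for perform A with B visible, given an ABORTS table."""
    return p2_4d_holds(
        "A",
        ["B"],
        lambda a: ANC[a],
        lambda c: aborts_table.get(c, set()),
        lambda a, b: TOWARD[(a, b)],
    )
```

With `{"Q": {"Z"}}` the check passes, since no ancestor of A is implicated; with `{"Q": {"P"}}` it fails, because reading B's version would let A observe evidence that its own ancestor P has aborted.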
We define the trivial mapping h_2a: L2 → La as the identity map on states and events.

Theorem 6.5.1: h_2a is a possibilities map.

Proof: Because the ANC-ABORT conditions do not appear in algebra La, every precondition at Level A also appears at Level 2 (in addition, Level 2 has precondition P2.4d). Thus h_2a preserves preconditions. All transitions are identical in La and L2; thus h_2a preserves transitions. Initial states are identical in L2 and La; thus h_2a preserves initial states. By Lemma 4.2.2.6, h_2a is a possibilities map. ∎

Since h_2a fixes T, all invariants (and pair-invariants) for La are invariant (or pair-invariant) for L2:

Lemma 6.5.2: Ia is invariant in L2, and Ja is pair-invariant in L2.

Proof: Since h_2a is a possibilities map which fixes T, and Ia is invariant for T in La, Ia is invariant for T in L2 by Lemma 4.2.4.3.5. Similarly, since Ja is pair-invariant for T in La, Ja is pair-invariant for T in L2, by Lemma 4.2.4.3.5. ∎
6.5.1 Specification of Mapping h_21

We define the trivial mapping h_21: L2 → L1 as the identity map on states and events. We must show that this mapping is a possibilities map.

Lemma 6.5.1.1: h_21 preserves initial states.

Proof: Trivial, since σ_2 = σ_1. ∎

Lemma 6.5.1.2: h_21 preserves transitions.

Proof: Trivial, since all transitions are identical in L2 and L1. ∎

We must also show that h_21 preserves preconditions. We use the following lemma to reduce this problem to the ANC-ABORT condition on reachable states in L2:

Lemma 6.5.1.3: Suppose that for all T ∈ ℛ_2, T ∈ ANC-ABORT. Then h_21 preserves preconditions.

Proof: It is obvious that h_21 preserves all preconditions except for the ANC-ABORT conditions, since all other preconditions appear at Level 2. We must verify that the ANC-ABORT conditions hold; these conditions state that the next state is in ANC-ABORT, i.e.

T ∈ PRE_2(e) ∩ ℛ_2, h_21(T) ∈ ℛ_1 ⇒ h_21(T)h_21(e) ∈ ANC-ABORT

But h_21(T)h_21(e) is just Te, since h_21 is the identity mapping. Thus we must show

T ∈ PRE_2(e) ∩ ℛ_2, h_21(T) ∈ ℛ_1 ⇒ Te ∈ ANC-ABORT.

But T ∈ PRE_2(e) ∩ ℛ_2 ⇒ Te ∈ ℛ_2. Thus it suffices to show that all reachable states in L2 are in ANC-ABORT. ∎

Our main result for L2 is thus that all reachable states are in ANC-ABORT, which will imply that h_21 preserves preconditions (and is thus a possibilities map):
Lemma 6.5.1.4: Let T ∈ ℛ_2. Then T ∈ ANC-ABORT.

Proof: Take A ∈ vertices_T. We show anc(A) ∩ ABORTS_T(A) = ∅.

The proof uses induction based on the recursive form of ABORTS:

ABORTS_T(A) = i-precedes_T(A) ∪ ⋃_{B ∈ v-precedes_T(A)} ABORTS_T(B)

Recall that by Lemma 6.5.2, we can use any results from Ia or Ja since we have shown that these properties are invariant in L2.
Thus Lemma 6.3.3.1 (in Ia) justifies the use of the inductive proof method.

Assume the Lemma holds for all B ∈ v-precedes_T(A): anc(B) ∩ ABORTS_T(B) = ∅.
First we show i-precedes_T(A) ∩ anc(A) = ∅:

B ∈ i-anc-seq_T(A) ⇒ (B,A′) ∈ seq, for some A′ ∈ anc(A), B ≠ A′,
⇒ B ∉ anc(A).

B ∈ i-child_T(A) ⇒ B ∈ children(A),
⇒ B ∉ anc(A).

B ∈ i-data-anc_T(A) ⇒ ∃B′ ∈ i-data_T(A): B = crucial_T(B′).
But by Lemma 6.3.1.1.2g, crucial_T(B′) ∈ desc(A⌋B′)
⇒ B ∉ anc(A).

Now we show B ∈ v-precedes_T(A) ⇒ anc(A) ∩ ABORTS_T(B) = ∅:

a. B ∈ v-anc-seq_T(A) ⇒ (B,A′) ∈ seq, for some A′ ∈ anc(A), B ≠ A′,
⇒ ABORTS_T(B) ∩ desc(A′) = ∅, by Lemma 6.3.3.2.
But by Induction Hypothesis, ABORTS_T(B) ∩ anc(B) = ∅.
But anc(B) = {B} ∪ prop-anc(A′), since (B,A′) ∈ siblings,
⇒ anc(A) ⊆ desc(A′) ∪ anc(B),
⇒ ABORTS_T(B) ∩ anc(A) = ∅.

b. B ∈ v-child_T(A) ⇒ ABORTS_T(B) ∩ anc(B) = ∅, by Induction Hypothesis.
But B ∈ children(A) ⇒ anc(A) ⊆ anc(B),
⇒ ABORTS_T(B) ∩ anc(A) = ∅.

c. B ∈ v-data-anc_T(A) ⇒ ∃B′ ∈ v-data_T(A): B = A⌋B′,
⇒ A ∈ datasteps_T.

Let T = T_0v, where v ∈ 𝒱_2 can be written as φπψ, with π = perform A,u.
Let T1 = T_0φ, and let T2 = T_0φπ.

Let A,B′ ∈ datasteps(x).
By Lemma 6.3.1.1.1c, (B′,A) ∈ data_T2
⇒ B′ ∈ datasteps_T1(x).
By precondition P2.4b, B′ ∈ visible_T1(A,x) ∨ B′ ∈ dead_T1(A,x).

B′ ∈ v-data_T(A) ⇒ B′ ∈ visible_T1(A,x),
⇒ anc(A) ∩ ABORTS_T1(A⌋B′) = ∅, by precondition P2.4d (the orphan detection precondition),
⇒ anc(A) ∩ ABORTS_T1(B) = ∅.

But ABORTS_T(B) < ABORTS_T1(B), by Lemma 6.3.3.3 (since B ∈ committed_T1),
⇒ ABORTS_T(B) ∩ anc(A) = ∅, by Lemma 2.2.1.1d. ∎


6.5.2 Proof that h_21 is a Possibilities Map

We now have all the facts needed to show that h_21 is a possibilities map:

Theorem 6.5.2.1: h_21 is a possibilities map.

Proof: h_21 preserves initial states by Lemma 6.5.1.1. h_21 preserves transitions by Lemma 6.5.1.2. Since we have shown that all reachable states in L2 are in ANC-ABORT (Lemma 6.5.1.4), h_21 preserves preconditions by Lemma 6.5.1.3. By Lemma 4.2.2.6, h_21 is a possibilities map. ∎
7. Partially Localized Model


In a distributed event-state algebra all preconditions for events are localized to the nodes at 
which those events occur (or to the message buffer). The Level 2 model is defined in terms of a single 
global state, the global AAT. As we move towards a distributed model, we partition the state into distinct 
components, and we attempt to localize preconditions to these components. At Level 3, we define an 
abstract set of locations, and we give each location a (local) state. This local state will consist of a UAS at 
each location, and an ordering on datasteps at each object. (The data ordering in an AAT is already
"localized," since data_T individually orders datasteps at each object.)


These locations are simply containers for information; they need not correspond directly to 
physical locations (nodes) at the lower levels. In a later chapter we will construct a mapping from a
distributed model where state is partitioned among nodes, to this localized model where state is partitioned
among locations. Essentially several abstract locations can reside at a single physical node. One
advantage of using abstract locations at this higher level is that we need not be concerned with how
information is physically distributed.


We can think of locations as "abstract nodes." We will consider each action and object to be a
separate location; the information at these locations will represent the view at that action or object. It will
be convenient to allow other (unspecified) locations as well. The events at this level will be either "local
steps," which are conceptually local to a particular location, or "communications steps," which transfer
information from one location to another. Transfer of information is instantaneous (i.e. there is no analog
to the message buffer at this level). (We will show later that we can model communications delays by
regarding "message slots" in the message buffer as locations. Thus we do not specify the complete set of
"locations" at this level; locations can be viewed abstractly as any information holders.)


We show that it is straightforward to localize all preconditions except for the orphan detection
condition (precondition P2.4d: B ∈ visible_T(A,x) ⇒ anc(A) ∩ ABORTS_T(A⌋B) = ∅).
7.1 Level 3 Algebra 
L3 = ⟨ℰ_3, Σ_3, σ_3, τ_3⟩

State Space:

Let tloc (the "tree locations") = (act - {U} - accesses) ∪ obj.
Let loc be a set of "locations," where tloc ⊆ loc.


(We exclude U from tloc because it is a virtual action; thus we associate no information with it directly. 
Also, we exclude accesses from tloc because we regard them as being coupled to their objects for 


information.) 
Σ_3 = {<T,L>}, where the components are

T - the "global state", an augmented action tree (as in L2)
L - the "local state", where L: loc → UAS

Notation

If "prop" is some property (function, relation, etc.) defined on UAS, then we denote prop_L(α) by "@prop_L[α]" (for example, visible_L(α)(A,x) = @visible_L[α](A,x)).

The "@" symbol flags components of the local state (as opposed to components of the global AAT). We also use the "@" symbol to distinguish communications events from "local" events, since the communications events only affect the local state.

We further abbreviate by writing @prop_L[A] for @prop_L[x] when A ∈ accesses(x).

σ_3 = <T_0, L_0>

T_0 - the trivial AAT, as in L2,
L_0(α) = T_0 - the trivial UAS, ∀α ∈ loc.


Events: 


Events create, commit, abort, and perform are localized to individual locations (except for the orphan 
detection precondition). We regard an action as being created at the location of its creator, and 
committed or aborted at its own location (or the location of its object if it is an access). (Recall that for A 
# U, creator(A) = parent(A) unless A € top, and creator(A) = A for A € top.) 
In addition to the "local" events (create, commit, abort, perform), we introduce "communications events" to move information from one location to another. The "source" of information is arbitrary for each event: communications events are parameterized by a single location, the destination of the information transfer. At lower levels we will parameterize communications events by the sender of information as well.

The communications events are as follows:

@create[α] A -- create action A at location α
@commit[α] A -- commit action A at location α
@abort[α] A -- abort action A at location α


The transition relation is defined so that each communications event is idempotent, i.e., the effect of a 
communications event which occurs multiple times is the same as the effect of this event occurring a 
single time. Idempotency "filters out" duplicate communications events. 
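
The idempotency property can be illustrated with a small Python sketch. This is not the thesis's formal transition relation: the dictionary representation of local states, the function names, and the location names "p" and "q" are assumptions made for the example. Each local state maps action names to a status, and a communications event copies knowledge that some location already holds into the destination.

```python
from typing import Dict

Status = Dict[str, str]   # action -> 'active' | 'committed' | 'aborted'
Local = Dict[str, Status]  # location -> local state (hypothetical representation)


def _copy(local: Local) -> Local:
    """Copy every local state so that events never mutate their input."""
    return {loc: dict(state) for loc, state in local.items()}


def at_create(local: Local, dest: str, a: str) -> Local:
    """@create[dest] A: requires that some location already knows A."""
    if not any(a in state for state in local.values()):
        raise ValueError(f"no location knows {a}")
    new = _copy(local)
    # A becomes 'active' at dest only if dest did not know A before.
    new[dest].setdefault(a, "active")
    return new


def at_commit(local: Local, dest: str, a: str) -> Local:
    """@commit[dest] A: requires that some location knows A as committed."""
    if not any(state.get(a) == "committed" for state in local.values()):
        raise ValueError(f"no location knows {a} as committed")
    new = _copy(local)
    new[dest][a] = "committed"
    return new
```

Applying `at_commit` twice with the same arguments yields the same state as applying it once, and a late `at_create` does not reset a status that the destination already holds.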

Transition Relation:

Let e ∈ ℰ_3, <T,L> ∈ Σ_3, <T,L>e = <T1,L1>.

1. create A (A ∈ act - {U})
PRECONDITIONS:
a. A ∉ @vertices_L[creator(A)]
b. parent(A) ∈ @active_L[creator(A)]
c. (B,A) ∈ seq, B ≠ A ⇒ B ∈ @done_L[creator(A)]
TRANSITIONS:
a. vertices_T1 ← vertices_T ∪ {A}
b. status_T1(A) ← 'active'
c. @vertices_L1[creator(A)] ← @vertices_L[creator(A)] ∪ {A}
d. @status_L1[creator(A)](A) ← 'active'

2. commit A (A ∈ act - {U} - accesses)
PRECONDITIONS:
a. A ∈ @active_L[A]
b. @children_L[A](A) ⊆ @done_L[A]
TRANSITIONS:
a. status_T1(A) ← 'committed'
b. @status_L1[A](A) ← 'committed'

3. abort A (A ∈ act - {U})
PRECONDITIONS:
a. A ∈ @active_L[A]
TRANSITIONS:
a. status_T1(A) ← 'aborted'
b. @status_L1[A](A) ← 'aborted'

4. perform A,u (A ∈ accesses(x), u ∈ values(x))
PRECONDITIONS:
a. A ∈ @active_L[x]
b. B ∈ @datasteps_L[x](x) ⇒ B ∈ @visible_L[x](A,x) ∨ B ∈ @dead_L[x](A,x)
c. u = result(x,s), where s = <<@visible_L[x](A,x); O(x)>>, and O = order(T).
d. B ∈ @visible_L[x](A,x) ⇒ anc(A) ∩ ABORTS_T(A⌋B) = ∅
TRANSITIONS:
a. status_T1(A) ← 'committed'
b. @status_L1[x](A) ← 'committed'
c. label_T1(A) ← u
d. data_T1 ← data_T ∪ {(B,A): B ∈ datasteps_T(x)} ∪ {(A,A)}

5. @create[α] A (A ∈ act - {U}, α ∈ loc)
PRECONDITIONS:
a. A ∈ @vertices_L[β], for some β ∈ loc
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. A ∉ @vertices_L[α] ⇒ @status_L1[α](A) ← 'active'

6. @commit[α] A (A ∈ act - {U}, α ∈ loc)
PRECONDITIONS:
a. A ∈ @committed_L[β], for some β ∈ loc
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. @status_L1[α](A) ← 'committed'

7. @abort[α] A (A ∈ act - {U}, α ∈ loc)
PRECONDITIONS:
a. A ∈ @aborted_L[β], for some β ∈ loc
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. @status_L1[α](A) ← 'aborted'
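
As a rough illustration of what "localized" means for the commit event (event 2 above), the Python sketch below checks the preconditions against the local state at A's own location only, and then updates both the global status and that local state. The data representation, the `children` table, and the function name are assumptions made for the example; they are not part of the algebra.

```python
from typing import Dict, Iterable, Tuple

Status = Dict[str, str]   # action -> 'active' | 'committed' | 'aborted'
Local = Dict[str, Status]  # location -> local state (hypothetical representation)


def commit(
    global_status: Status,
    local: Local,
    children: Dict[str, Iterable[str]],
    a: str,
) -> Tuple[Status, Local]:
    """Local step `commit A`, checked only against the local state at location A."""
    view = local[a]
    # Precondition 2a: A is active in the local state at A.
    if view.get(a) != "active":
        raise ValueError(f"{a} is not active at its own location")
    # Precondition 2b: every child of A that is known at A is done there.
    known_children = [c for c in children.get(a, ()) if c in view]
    if any(view[c] == "active" for c in known_children):
        raise ValueError(f"{a} has a child that is still active at location {a}")
    # Transitions 2a and 2b: update the global tree and the local state at A.
    new_global = {**global_status, a: "committed"}
    new_local = {loc: dict(state) for loc, state in local.items()}
    new_local[a][a] = "committed"
    return new_global, new_local
```

Note that the check can fail even when the global tree already records a child as committed: until that fact has been communicated to location A, the local precondition is not satisfied.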
7.2 Specification of Mapping h_32

We define a (single-state) mapping from L3 to L2, h_32: L3 → L2. (We abbreviate "h_32" as "h" in this chapter.)

State Mapping

h: Σ_3 → Σ_2 is defined by h(<T,L>) = T, ∀<T,L> ∈ Σ_3. Thus h fixes T.

Event Mapping

h: ℰ_3 → ℰ_2 is defined by

h: create A → create A
commit A → commit A
abort A → abort A
perform A,u → perform A,u

@create[α] A → Λ
@commit[α] A → Λ
@abort[α] A → Λ
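
The event mapping can be pictured as a projection that keeps local events and erases communications events. The Python sketch below is only an illustration: events are encoded as tuples whose first element names the event kind, which is an assumption of the example rather than notation from the text, and the empty sequence Λ is represented by an empty list.

```python
from typing import List, Tuple

Event = Tuple[str, ...]  # e.g. ("commit", "A") or ("@commit", "alpha", "A")

LOCAL_KINDS = {"create", "commit", "abort", "perform"}
COMMUNICATION_KINDS = {"@create", "@commit", "@abort"}


def h_event(e: Event) -> List[Event]:
    """Map one Level 3 event to a (possibly empty) sequence of Level 2 events."""
    kind = e[0]
    if kind in LOCAL_KINDS:
        return [e]   # local events map to themselves
    if kind in COMMUNICATION_KINDS:
        return []    # communications events map to the empty sequence
    raise ValueError(f"unknown event kind: {kind}")


def h_sequence(v: List[Event]) -> List[Event]:
    """Extend h to event sequences by concatenating the images of each event."""
    return [image for e in v for image in h_event(e)]
```

For example, `h_sequence([("create", "A"), ("@create", "alpha", "A"), ("commit", "A")])` keeps only the two local events, in their original order.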


7.3 Level 3 Invariants 
The following simple pair-invariants are analogous to the Level A pair-invariants from Lemma 6.3.1.1.1: 
Lemma 7.3.1: Let (<T,L>,<T1,L1>) ∈ ℛ_3^(2). Then the following pair-invariants hold (Let α ∈ loc, x ∈ obj, A,B ∈ act):

a. @vertices_L[α] ⊆ @vertices_L1[α], @committed_L[α] ⊆ @committed_L1[α],
@aborted_L[α] ⊆ @aborted_L1[α], @done_L[α] ⊆ @done_L1[α]

b. B is dead in L(α) ⇒ B is dead in L1(α); B is live in L1(α) ⇒ B is live in L(α)

c. @visible_L[α](A) ⊆ @visible_L1[α](A), @dead_L[α](A) ⊆ @dead_L1[α](A)

Proof: Straightforward. ∎
The following invariants relate local states to the global state. Essentially each local state represents a "partial view" of the true global state. We show these invariants relative to mapping h: since h fixes T, all invariants and pair-invariants for T in L2 can be applied to the proofs (by Lemma 4.2.4.3.6).

Recall that we have shown in Lemma 6.5.2 that invariants Ia and Ja from Level A are invariant in L2.

Lemma 7.3.2: Let <T,L> ∈ ℛ_3. Then the following are invariant relative to h. (Let α ∈ loc, x ∈ obj, A,B ∈ act):

a. A ∈ vertices_T - {U} ⇔ A ∈ @vertices_L[creator(A)];
U ∈ active_T ∧ U ∈ @active_L[α] (∀α ∈ tloc)

b. A ∈ committed_T ⇔ A ∈ @committed_L[A]

c. A ∈ aborted_T ⇔ A ∈ @aborted_L[A]

d. A ∈ done_T ⇔ A ∈ @done_L[A]

e. A ∈ datasteps_T(x) ⇔ A ∈ @datasteps_L[x](x)

f. @vertices_L[α] ⊆ vertices_T, @committed_L[α] ⊆ committed_T, @aborted_L[α] ⊆ aborted_T, @done_L[α] ⊆ done_T
(Note: @active_L[α] ⊆ active_T does not necessarily hold.)

g. A ∈ @active_L[A] ⇒ A ∈ active_T
(Note: not necessarily conversely)

h. B ∈ @visible_L[α](A) ⇒ B ∈ visible_T(A)

i. B ∈ @dead_L[α](A) ⇒ B ∈ dead_T(A)

j. (B,A) ∈ data_T (A,B ∈ accesses(x)) ⇒
B ∈ visible_T(A) ⇔ B ∈ @visible_L[x](A)

k. (B,A) ∈ data_T (A,B ∈ accesses(x)) ⇒
B ∈ dead_T(A) ⇔ B ∈ @dead_L[x](A)

l. (B,A) ∈ data_T (A,B ∈ accesses(x)) ⇒
B ∈ @dead_L[x](A) ⇒ {crucial_T(B)} < @aborted_L[x]


Proof: 
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a IFA #U,AE vertices,, then there must have been an event create A, which 
also has the effect of placing A in @vertices, [creator(A)]. Using Lemma 7.3.1a, 
we conclude that A € vertices, =» A € @vertices, [creator(A)]. 


Conversely, if A € @vertices, [creator(A)] then there must have been an event 
create A, or @create[creator{A)] A. Consider the first such event. If it is create 
A then A € vertices,. Now suppose there were an event @create[a] A, for some 
a, that preceded create A. Let ¢ be the first such event, and let the state 
immediately before the execution of e be <T1,L1]>. By the precondition for e, A 
€ G@vertices, [8] for some B. But if A € @vertices, ,[B], then either B = 
creator(A) and create A precedes e, or an event f = @creatc[B] A precedes e. 
Both cases contradict our choice of e. 


U € active, by Lemma 6.3.1.1.2d. To see that U € @active, [a], note that U € 
@active, {a}, but no event can change U’s status. 


b. Similar to (a). 
c. Similar to (a). 
d. Follows directly from (b) and (c). 


e. A € datasteps,(x) «= A € committed, 
«= A € Ecommitted, [x] by (b), 
«= A € @datasteps, [xKx). 


f. We argue @vertices, [a] € vertices;; the other cases are similar: If a = 
creator(A), then the result follows directly from (a). Otherwise we can show 
@vertices, [a] € vertices, by induction on the number of events in a valid 
sequence generating <T,L>. In the initial state @vertices, fa] = vertices, = {U}. 
But vertices, [a] can only increase when an event @create{a] A occurs, which 
requires as precondition A € @vertices, [8] for some B (where <T1,L1 is the 
state before this event occurs). By induction hypothesis, A € @vertices, ,[8] = 


A € vertices, = A € vertices,. 


g. A € @active,[A] = A € vertices, from (f). If A € done,, then A € 
@done, [A] by (d) -- a contradiction. Thus A € active, 


h. B € @visible, [aKA) = anc(A) M prop-desc(Ica(A,B)) € @committed, [a]. 
But @commitied, [a] € committed, by (A, 
= BE visible,(A). 


i, B € @dead, [a A) = anc(A) N prop-desc(Ica(A,B)) M @aborted, [a] # 2. 
But @aborted, [a] € aborted, by (f), 


= BE dead,(A). 
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j. B€ @visible, [xKA) = B € visible fA) by (h). 
BE visible (A) = B € visible(A,x), 
= B € @datasteps, [xKx). Assume B ¥ A (otherwise the result is obvious). 


Let <T,L> = 0,V,V € %. 
Let v = pay, where # = perform A,u, and let <T1,LID = o,9. 


(B.A) € data, =» (B.A) € data,,, by Lemma 6.3.1.1lc, = B € datasteps,,, 
= B € @datasteps, ,[xKx), by (e), 
= BE€ @visibie, [xKAx) V BE @dead, ,[xKA.x) by P3.4b. 


But B € @dead, ,[xKA,x) => BE dead,,(A,x) (by (i), = B € dead,{A,x) -- a 
contradiction. Thus B € @visible, [xKA.x), 
= BE @visible, [xKA,x) (using Lemma 7.3.1c). 


k. Similar to (j) above.

l. B ∈ @dead_L[x](A,x) ⇒ anc(B) ∩ @aborted_L[x] ≠ ∅,
⇒ @crucial_L[x](B) is defined.
But anc(B) ∩ @aborted_L[x] ⊆ anc(B) ∩ aborted_T by (f),
⇒ crucial_T(B) ∈ desc(@crucial_L[x](B)),
⇒ {crucial_T(B)} ⊆ @aborted_L[x].


7.4 Proof of Possibilities Map for h_32


We now show that h is a possibilities map. Let I3 denote the conjunction of all properties in
Lemma 7.3.2. We will show that h is a possibilities map relative to I3.


Lemma 7.4.1: h preserves initial states.

Proof: Immediate, since h(<T_σ, L_σ>) = T_σ. ∎

Lemma 7.4.2: h preserves transitions relative to I3.

Proof: We must show that if <T,L> ∈ PRE_3(e) ∩ R_3 ∩ I3, and h(<T,L>) ∈ PRE_2(h(e)) ∩ R_2,
then h(<T,L>e) = h(<T,L>)h(e).

But h(<T,L>) = T, so we must show the following:

If <T,L> ∈ PRE_3(e) ∩ R_3 ∩ I3, and T ∈ PRE_2(h(e)) ∩ R_2, and <T,L>e = <T1,L1>, then
T1 = (T)h(e).

For the communications steps in L3 (e = @create, @commit, @abort), T is not altered, and
h(e) = Λ, so T1 = T = (T)h(e).

For the local steps (e = create, commit, abort, perform), it is easily verified by inspection that
the effects of events on T are identical in L2 and L3. But h(e) = e, so T1 = (T)h(e) = (T)e. ∎


Lemma 7.4.3: h preserves preconditions relative to I3.

Proof: We must show that if <T,L> ∈ PRE_3(e) ∩ R_3 ∩ I3, and h(<T,L>) ∈ R_2, then
h(<T,L>) ∈ PRE_2(h(e)).

Since h(<T,L>) = T, we show
<T,L> ∈ PRE_3(e) ∩ R_3 ∩ I3, T ∈ R_2 ⇒ T ∈ PRE_2(h(e)).

For communications steps e, h(e) = Λ, and preservation of preconditions follows vacuously.

For local steps, h(e) = e. We prove preservation of preconditions for each local step in turn:


1. create A

a. P3.1a ⇒ A ∉ @vertices_L[creator(A)],
⇒ A ∉ vertices_T by Lemma 7.3.2a.

b. If A ∈ top, then parent(A) = U, and U ∈ active_T by Lemma 7.3.2a.
Otherwise creator(A) = parent(A),
⇒ parent(A) ∈ @active_L[parent(A)] by P3.1b,
⇒ parent(A) ∈ active_T by Lemma 7.3.2g.

c. (B,A) ∈ seq, B ≠ A ⇒ B ∈ @done_L[creator(A)] by P3.1c,
⇒ B ∈ done_T by Lemma 7.3.2g.

2. commit A

a. P3.2a ⇒ A ∈ @active_L[A],
⇒ A ∈ active_T by Lemma 7.3.2g.

b. Let B ∈ children_T(A). A ≠ U ⇒ B ∉ top,
⇒ creator(B) = A,
⇒ B ∈ @vertices_L[A] by Lemma 7.3.2a,
⇒ B ∈ @children_L[A](A),
⇒ B ∈ @done_L[A] by P3.2b,
⇒ B ∈ done_T by Lemma 7.3.2f.

3. abort A

a. P3.3a ⇒ A ∈ @active_L[A],
⇒ A ∈ active_T by Lemma 7.3.2g.

4. perform A,u

a. P3.4a ⇒ A ∈ @active_L[x],
⇒ A ∈ @active_L[A],
⇒ A ∈ active_T by Lemma 7.3.2g.

b. B ∈ datasteps_T(x) ⇒ B ∈ @datasteps_L[x](x) by Lemma 7.3.2e,
⇒ B ∈ @visible_L[x](A,x) ∨ B ∈ @dead_L[x](A,x) by P3.4b.

B ∈ @visible_L[x](A,x) ⇒ B ∈ visible_T(A,x) by Lemma 7.3.2h.
B ∈ @dead_L[x](A,x) ⇒ B ∈ dead_T(A,x) by Lemma 7.3.2i.

c. P3.4c ⇒ u = result(x,s), where s = <<@visible_L[x](A,x); O(x)>>,
and O = order(T). We must show s = s', where s' =
<<visible_T(A,x); data_T>>. By definition, O(x) and data_T are identical
on datasteps_T(x), so it suffices to show @visible_L[x](A,x) = visible_T(A,x).

@visible_L[x](A,x) ⊆ visible_T(A,x) by Lemma 7.3.2h. So take B ∈ visible_T(A,x),
⇒ B ∈ @datasteps_L[x](x) by Lemma 7.3.2e,
⇒ B ∈ @visible_L[x](A,x) ∨ B ∈ @dead_L[x](A,x) by P3.4b.

But B ∈ @dead_L[x](A,x) ⇒ B ∈ dead_T(A,x) (by Lemma 7.3.2i) -- a contradiction;
⇒ B ∈ @visible_L[x](A,x).

d. B ∈ visible_T(A,x) ⇒ B ∈ @visible_L[x](A,x) (as in (c) above),
⇒ anc(A) ∩ ABORTS_T(A|B) = ∅ by P3.4d. ∎


Lemma 7.4.4: h is a possibilities map relative to I3.

Proof: Follows immediately from Lemmas 7.4.1, 7.4.2, 7.4.3, and from Lemma 4.2.4.2.4. ∎

Theorem 7.4.5: h is a possibilities map, and I3 is invariant in L3.

Proof: By Lemma 7.3.2, I3 is invariant relative to h. By Lemma 7.4.4, h is a possibilities map
relative to I3. We apply Lemma 4.2.4.2.6 to conclude that h is a possibilities map, and I3 is an
invariant. ∎

Since h_32 is a possibilities map which fixes T, all invariants and pair-invariants from L2 carry
down to L3. Let J3 denote the conjunction of all pair-properties from Lemma 7.3.1. We summarize the
invariants for L3 as follows:

Lemma 7.4.6: I3 is invariant in L3, Ia is invariant in L3, J3 is pair-invariant in L3, and Ja is
pair-invariant in L3.

Proof: Invariance of I3 is shown in Theorem 7.4.5. J3 is pair-invariant in L3 by Lemma 7.3.1.
Since h_32 is a possibilities map which fixes T, and Ia is invariant for T in L2 (by Lemma 6.5.2),
Ia is invariant for T in L3, by Lemma 4.2.4.3.5. Similarly, since Ja is pair-invariant for T in L2
(by Lemma 6.5.2), Ja is pair-invariant for T in L3, by Lemma 4.2.4.3.5. ∎




8. Value Maps -- A Model of Atomic Objects


At Level 4 we introduce value maps as a data structure for keeping lock and version information
about objects. A value map is a mapping from each object to a "stack of versions" for that object; each
version is associated with an action that holds a lock on that object. This data structure corresponds
roughly to the implementation of atomic objects as described in [Moss81]. In Moss's locking scheme, a
lock can be held on an atomic object at each level in the action tree. This scheme constrains all holders of
a lock on a particular object to be related. We note again that we are dealing only with mutual exclusion
locks. Moss develops a more general locking protocol which distinguishes between read locks and write
locks.


We regard these value maps as an abstraction of the information which is already present in the
local UAS's at each object. In this sense the value maps introduce no new information into the state. As
we stressed in Chapter 4, the state in an event-state algebra is simply one convenient way of capturing
execution histories. Value maps are a convenient abstraction of execution histories because the
preconditions on a perform event can be stated easily in terms of value maps.


Level 4 is no more "localized" than Level 3. The events in Level 4 are identical to those in Level
3 (though transitions and preconditions are reformulated in terms of value maps), and the event mapping
h_43 is the identity. In particular, then, the communications events at Level 4 are still very simple, and
they do not include enough information to allow localization of the orphan detection precondition. The
non-local orphan detection precondition appears unchanged at Level 4.

8.1 Level 4 Algebra

L4 = (E_4, Σ_4, σ_4, τ_4)

State Space:

Σ_4 = {<T,L,V>}, where the components are:

T - the global state, an augmented action tree (as in L2),
L - the local state (mapping loc to UAS) (as in L3),
V - a value map.

A value map gives a set of values for each object -- one value for each action which "holds a lock" (and
thus a version) on that object.

V: obj × act → values ∪ {⊥}
(where ∀A ∈ act, x ∈ obj, V(x,A) ∈ values(x) ∪ {⊥}).

Define V(x) = {A ∈ act: V(x,A) ≠ ⊥}. (V(x) represents the actions which hold locks on object x.)
If V(x) forms an ancestor chain, then define

V(x).holder = the lowest (in anc-order) element of V(x), i.e.,
V(x).holder ∈ V(x), and ∀B ∈ V(x), V(x).holder ∈ desc(B).

(If V(x) does not form an ancestor chain, then V(x).holder is undefined. For reachable states in L4, V(x)
will always form an ancestor chain (see below).)

If V(x).holder is defined, then define V(x).value = V(x,V(x).holder). V(x).value denotes the "current"
value of object x which will be seen by any datastep accessing x.
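
The definitions of V(x), V(x).holder and V(x).value can be illustrated with a small sketch. It is not part of the formal model: the dictionary encoding of the action tree, the sentinel BOTTOM standing for ⊥, and the function names are all assumptions made for illustration.

```python
# Illustrative sketch only; names and encodings are hypothetical.
BOTTOM = object()  # stands in for the undefined version ⊥


def ancestors(parent, a):
    """Return [a, parent(a), ..., U]; an action counts as its own ancestor."""
    chain = []
    cur = a
    while cur is not None:
        chain.append(cur)
        cur = parent[cur]
    return chain


def lock_holders(v_x):
    """V(x): the actions whose version of object x is defined."""
    return {a for a, val in v_x.items() if val is not BOTTOM}


def holder(parent, v_x):
    """V(x).holder: the lock holder that is a descendant of every lock holder.

    Returns None when no such action exists, which is exactly the case
    where V(x) does not form an ancestor chain (or is empty).
    """
    locked = lock_holders(v_x)
    for cand in locked:
        if locked <= set(ancestors(parent, cand)):
            return cand
    return None


def current_value(parent, v_x):
    """V(x).value: the version held by the holder, or BOTTOM if undefined."""
    h = holder(parent, v_x)
    return BOTTOM if h is None else v_x[h]


# Action tree: U is the root, A and C are children of U, B is a child of A.
parent = {"U": None, "A": "U", "B": "A", "C": "U"}
v_x = {"U": 0, "A": 5, "B": BOTTOM}
print(holder(parent, v_x), current_value(parent, v_x))  # A 5
```

In this example the lock holders are U and A, which form an ancestor chain, so the holder is A and the current value is A's version. If unrelated actions such as A and C both held versions, holder would return None, matching the "undefined" case in the text.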
σ_4 = <T_σ, L_σ, V_σ>, where

∀x ∈ obj, V_σ(x,U) = init(x),
V_σ(x,A) = ⊥, ∀A ≠ U.

Events:
E_4 = E_3 (The sets of events are identical in Levels 3 and 4, although preconditions and transitions differ.)

Transition Relation

Let e ∈ E_4, <T,L,V> ∈ Σ_4, <T,L,V>e = <T1,L1,V1>.


1. create A (A ∈ act - {U})
PRECONDITIONS:
a. A ∉ @vertices_L[creator(A)]
b. parent(A) ∈ @active_L[creator(A)]
c. (B,A) ∈ seq, B ≠ A ⇒ B ∈ @done_L[creator(A)]
TRANSITIONS:
a. vertices_T1 ← vertices_T ∪ {A}
b. status_T1(A) ← 'active'
c. @vertices_L1[creator(A)] ← @vertices_L[creator(A)] ∪ {A}
d. @status_L1[creator(A)](A) ← 'active'

2. commit A (A ∈ act - {U} - accesses)
PRECONDITIONS:
a. A ∈ @active_L[A]
b. @children_L[A](A) ⊆ @done_L[A]
TRANSITIONS:
a. status_T1(A) ← 'committed'
b. @status_L1[A](A) ← 'committed'

3. abort A (A ∈ act - {U})
PRECONDITIONS:
a. A ∈ @active_L[A]
TRANSITIONS:
a. status_T1(A) ← 'aborted'
b. @status_L1[A](A) ← 'aborted'

4. perform A,u (A ∈ accesses(x), u ∈ values(x))
PRECONDITIONS:
a. A ∈ @active_L[x]
b. A ∈ prop-desc(V(x).holder)
c. u = V(x).value
d. anc(A) ∩ @aborted_L[x] = ∅
e. B ∈ @visible_L[x](A,x) ⇒ anc(A) ∩ ABORTS_T(A|B) = ∅
TRANSITIONS:
a. status_T1(A) ← 'committed'
b. @status_L1[x](A) ← 'committed'
c. label_T1(A) ← u
d. data_T1 ← data_T ∪ {(B,A): B ∈ datasteps_T(x)} ∪ {(A,A)}
e. V1(x,parent(A)) ← update(A)(u)

5. @create[α] A (A ∈ act - {U}, α ∈ loc)
PRECONDITIONS:
a. A ∈ @vertices_L[β], for some β ∈ loc
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. A ∉ @vertices_L[α] ⇒ @status_L1[α](A) ← 'active'

6. @commit[α] A (A ∈ act - {U}, α ∈ loc)
PRECONDITIONS:
a. A ∈ @committed_L[β], for some β ∈ loc
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. @status_L1[α](A) ← 'committed'
c. α ∈ obj, V(α,A) ≠ ⊥ ⇒
V1(α,A) ← ⊥
V1(α,parent(A)) ← V(α,A)

7. @abort[α] A (A ∈ act - {U}, α ∈ loc)
PRECONDITIONS:
a. A ∈ @aborted_L[β], for some β ∈ loc
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. @status_L1[α](A) ← 'aborted'
c. α ∈ obj, B ∈ desc(A) ⇒
V1(α,B) ← ⊥


Local create, commit, and abort events are identical in L4 and L3. The preconditions on
perform events are given in terms of the value map. Note that we include a "local orphan detection"
precondition (P4.4d): this condition is necessary for the value map to hold the proper versions, but it is
not sufficient to detect all harmful orphans. Thus we retain the non-local orphan detection precondition
(P4.4e).


The effect of a perform event is to update the "current" version. A lock on the current version is
held by the parent of the datastep immediately after the perform event. The value map is updated by
commit and abort messages: a commit message for an action releases a lock held by that action to its
parent (and the parent inherits its child's version). An abort message for an action releases all locks held
by descendants of that action (and the versions are discarded).
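
The following sketch illustrates how perform events and commit and abort messages act on the versions of a single object x. It is a simplified illustration, not the formal Level 4 algebra: the class ObjectVersions, its method names, and the encoding of update(A) as a plain function on values are assumptions, and only preconditions P4.4b, P4.4c and the local orphan check P4.4d are modelled.

```python
# Illustrative sketch only; class and method names are hypothetical.
class ObjectVersions:
    """Versions of one object x, keyed by the action holding the lock."""

    def __init__(self, parent, init):
        self.parent = parent          # action -> parent; the root "U" maps to None
        self.versions = {"U": init}   # V(x, .); a missing key plays the role of ⊥
        self.aborted = set()          # aborts known locally at x

    def anc(self, a):
        """Ancestors of a, including a itself."""
        out = []
        while a is not None:
            out.append(a)
            a = self.parent[a]
        return out

    def holder(self):
        # Lowest lock holder; relies on V(x) forming an ancestor chain.
        return max(self.versions, key=lambda a: len(self.anc(a)))

    def perform(self, access, update):
        """Access reads the current value; its parent then holds the new version."""
        h = self.holder()
        chain = self.anc(access)
        if access == h or h not in chain:          # P4.4b
            raise ValueError("access is not a proper descendant of the holder")
        if self.aborted.intersection(chain):       # P4.4d (local orphan check)
            raise ValueError("local orphan: an ancestor is known to be aborted")
        u = self.versions[h]                       # P4.4c: the value seen
        self.versions[self.parent[access]] = update(u)
        return u

    def on_commit(self, a):
        """Commit message for a: the parent inherits a's lock and version."""
        if a in self.versions:
            self.versions[self.parent[a]] = self.versions.pop(a)

    def on_abort(self, a):
        """Abort message for a: discard all versions held by descendants of a."""
        self.aborted.add(a)
        for b in list(self.versions):
            if a in self.anc(b):
                del self.versions[b]


# Tree: U -> A -> B -> a1 (an access), and A -> a2 (an access).
parent = {"U": None, "A": "U", "B": "A", "a1": "B", "a2": "A"}
x = ObjectVersions(parent, init=0)
seen = x.perform("a1", lambda v: v + 1)   # a1 sees 0; B now holds version 1
x.on_commit("B")                          # A inherits B's version
x.on_abort("A")                           # A's version is discarded
print(seen, x.holder(), x.versions)
```

After the abort message for A, only U's original version remains, so later accesses see the value from before A and its descendants ran. A further perform by a2 would be rejected by the local orphan check, since its ancestor A is now known to be aborted at x.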
8.2 Specification of Mapping h_43


We define a (single-state) mapping from L4 to L3, h_43: L4 → L3. (We abbreviate "h_43" as "h"
in this chapter.)


State Mapping
h: Σ_4 → Σ_3 is defined by h(<T,L,V>) = <T,L> ∀<T,L,V> ∈ Σ_4. Thus h fixes <T,L>.

Event Mapping
h: E_4 → E_3 is the identity mapping on events, i.e. h(e) = e ∀e ∈ E_4.


8.3 Level 4 Invariants


The following invariants relate the information in the value map, V, to the local data structure,
L. We show these invariants relative to mapping h: Since h fixes <T,L>, all invariants and pair-invariants
for <T,L> in L3 can be applied to the proofs (by Lemma 4.2.4.3.6). (Recall that we have shown in Lemma
7.4.6 that I3 and Ia are invariant in L3, and J3 and Ja are pair-invariant in L3.)

Lemma 8.3.1: Let <T,L,V> ∈ R_4. Then the following are invariant relative to h:

(∀x ∈ obj) (let M = L(x), and let O = order(T)):

a. B ∈ V(x) ⇒ B is live in M

b. V(x) forms an ancestor chain

c. V(x) ∩ accesses = ∅

d. B ∈ datasteps_M(x) ⇒
B is dead in M ∨ ∃B' ∈ anc(B) ∩ V(x): B ∈ visible_M(B')

e. B ∈ V(x), v-prop-desc_M(B) ∩ V(x) = ∅ ⇒
V(x,B) = result(x,s), where s = <<visible_M(B,x); O(x)>>

f. H = V(x).holder, B ∈ desc(H), B is live in M
⇒ visible_M(B,x) = visible_M(H,x)

g. B ∈ V(x), C ∈ datasteps_M ⇒ exactly one of the following holds:
1. C ∈ visible_M(B)
2. C ∈ dead_M(B)
3. ∃C' ∈ prop-desc(B) ∩ prop-anc(C) ∩ V(x):
(C ∈ visible_M(C')) ∧ (C' ∉ visible_M(B))


Proof: We show below that (f) and (g) follow from (a) - (e). It is trivial to show that (a) - (e)
are σ-invariant (i.e. that they hold for σ_4). The proofs that (b) and (c) are invariant relative to
h are straightforward; we will argue (a), (d), and (e).

For the induction step, let <T,L,V> ∈ R_4 ∩ PRE_4(e), and assume that (a) - (g) hold for
<T,L,V>. Let <T',L',V'> = <T,L,V>e. Let O = order(T), O' = order(T'), M = L(x), M' =
L'(x). We must show that (a), (d), and (e) hold for <T',L',V'>. By Lemma 4.2.4.3.6, we can
assume that <T,L,V> and <T',L',V'> satisfy any invariants from I3 or Ia, and we can assume
that (<T,L,V>,<T',L',V'>) satisfy any pair-invariants from J3 or Ja.

Since properties (a), (d), and (e) depend only on V(x), committed_M, aborted_M, and O(x), we
need only consider events, e, which modify these components. By inspection, these events are
{abort A, perform A,u: A ∈ accesses(x)} ∪ {@commit[x] A, @abort[x] A}.


1. abort A, A ∈ accesses(x)

aborted_M' = aborted_M ∪ {A}.
committed_M' = committed_M.
V' = V, O' = O.

a. B ∈ V'(x) ⇒ B ∈ V(x), ⇒ B is live in M (by (a)).
But by (c), V(x) ∩ accesses = ∅,
⇒ B ∉ accesses, ⇒ anc(B) ∩ {A} = ∅, ⇒ B is live in M'.

d. B ∈ datasteps_M'(x) ⇒ B ∈ datasteps_M(x),
⇒ B is dead in M ∨ ∃B' ∈ anc(B) ∩ V(x): B ∈ visible_M(B'). But
B is dead in M ⇒ B is dead in M' by 7.3.1b.
B' ∈ anc(B) ∩ V(x) ⇒ B' ∈ anc(B) ∩ V'(x), and
B ∈ visible_M(B') ⇒ B ∈ visible_M'(B'), by 7.3.1c.

e. Immediate, since all components are unchanged.


2. perform A,u, A ∈ accesses(x)

O'(x) = O(x) ∪ {(B,A): B ∈ datasteps_M(x)} ∪ {(A,A)}.
V'(x,parent(A)) = update(A)(V(x).value).
V'(x,B) = V(x,B) ∀B ≠ parent(A).
V'(x) = V(x) ∪ {parent(A)}.
committed_M' = committed_M ∪ {A}.
aborted_M' = aborted_M (thus live in M ⇒ live in M').

Note that A ∈ prop-desc(V(x).holder), by P4.4b, and A is live in M, by P4.4d.

a. B ∈ V'(x) ⇒ B ∈ V(x) ∨ B = parent(A).
If B ∈ V(x), then B is live in M, so B is live in M'.
If B = parent(A), then anc(B) ⊆ anc(A), and A is live in M. Thus B
is live in M, and B is live in M'.

d. B ∈ datasteps_M'(x) ⇒ B ∈ datasteps_M(x) ∨ B = A.
If B ∈ datasteps_M(x), then B is dead in M ∨ ∃B' ∈ anc(B) ∩ V(x):
B ∈ visible_M(B').

If B is dead in M, then B is dead in M' by 7.3.1b.
If B' ∈ anc(B) ∩ V(x), then B' ∈ anc(B) ∩ V'(x), and B ∈
visible_M(B') ⇒ B ∈ visible_M'(B') by 7.3.1c.

If B = A, then take B' = parent(A), because parent(A) ∈ anc(A) ∩
V'(x), and A ∈ visible_M'(parent(A)).

e. B ∈ V'(x), v-prop-desc_M'(B) ∩ V'(x) = ∅.
B ∈ V'(x) ⇒ B ∈ V(x) ∨ B = parent(A).

Case 1: B ∈ V(x).
v-prop-desc_M(B) ⊆ v-prop-desc_M'(B), and V(x) ⊆ V'(x)
⇒ v-prop-desc_M(B) ∩ V(x) = ∅.
Thus V'(x,B) = V(x,B) = result(x,s), where s = <<visible_M(B,x); O(x)>>.

We must show V'(x,B) = result(x,s'), where s' = <<visible_M'(B,x);
O'(x)>>. Since O(x) ⊆ O'(x), it suffices to show visible_M'(B,x) =
visible_M(B,x). Since A is the only action whose status changes from
M to M', visible_M'(B,x) = visible_M(B,x) unless A ∈ visible_M'(B,x).
So assume A ∈ visible_M'(B,x).

Since parent(A) ∈ desc(V(x).holder) (by P4.4b), parent(A) ∈
prop-desc(B) (since we assumed B ≠ parent(A)). But A ∈
visible_M'(B,x) ⇒ parent(A) ∈ v-prop-desc_M'(B) ∩ V'(x) -- a
contradiction.

Case 2: B = parent(A).
If B = parent(A), then V'(x,B) = update(A)(V(x).value).
Let H = V(x).holder. Then by definition of holder
v-prop-desc_M(H) ∩ V(x) = ∅. Thus V(x,H) = result(x,s), where s
= <<visible_M(H,x); O(x)>>, ⇒ V'(x,B) = result(x,s'), where s' =
<<visible_M(H,x) ∪ {A}; O'(x)>>.

We must show that visible_M(H,x) ∪ {A} = visible_M'(parent(A),x).
Clearly visible_M'(parent(A),x) = visible_M(parent(A),x) ∪ {A}, so we
show visible_M(H,x) = visible_M(parent(A),x).

But A ∈ prop-desc(H) ⇒ parent(A) ∈ desc(H), and A live in M ⇒
parent(A) live in M. Thus visible_M(H,x) = visible_M(parent(A),x),
by (f).


3. @commit[x] A

There are two cases:

(1) If V(x,A) ≠ ⊥, then
V'(x) = V(x) - {A} ∪ {parent(A)},
V'(x,A) = ⊥,
V'(x,parent(A)) = V(x,A).

(2) If V(x,A) = ⊥, then V'(x) = V(x).

committed_M' = committed_M ∪ {A}.
aborted_M' = aborted_M (thus live in M ⇒ live in M').

datasteps_M'(x) = datasteps_M(x), since A ∈ accesses(x) ⇒ A ∈
@committed_L[β], for some β, by P4.6a, ⇒ A ∈ committed_T by Lemma 7.3.2f,
⇒ A ∈ committed_M, by Lemma 7.3.2b.

a. For case (2), B ∈ V'(x) ⇒ B ∈ V(x) ⇒ B is live in M ⇒ B is live
in M'.
For case (1), B ∈ V'(x) ⇒ B ∈ V(x) or B = parent(A). If B ∈ V(x)
then the proof is identical to case (2); otherwise we know A ∈ V(x),
and A is live in M. It follows that B is live in M ⇒ B is live in M'.

d. B ∈ datasteps_M'(x) ⇒ B ∈ datasteps_M(x)
⇒ B is dead in M ∨ ∃B' ∈ anc(B) ∩ V(x): B ∈ visible_M(B').

If B is dead in M then B is dead in M', so suppose that B' ∈ anc(B)
∩ V(x).

For case (2), V'(x) = V(x), ⇒ B' ∈ anc(B) ∩ V'(x), and B ∈
visible_M(B') ⇒ B ∈ visible_M'(B').

For case (1), A ∈ V(x), and V'(x) = V(x) - {A} ∪ {parent(A)}.
If B' ≠ A, then B' ∈ anc(B) ∩ V'(x) and B ∈ visible_M'(B') as above.

If B' = A, then B ∈ visible_M'(A), and A ∈ visible_M'(parent(A)) (since
A ∈ committed_M').
Thus B ∈ visible_M'(parent(A)), and parent(A) ∈ anc(B) ∩ V'(x).

e. B ∈ V'(x), v-prop-desc_M'(B) ∩ V'(x) = ∅.

Case 1: A ∈ V(x), ⇒ V'(x) = V(x) - {A} ∪ {parent(A)}. Thus B ≠ A.

Case 1a: B ≠ parent(A).
⇒ B ∈ V(x). But v-prop-desc_M(B) ⊆ v-prop-desc_M'(B), and V(x) -
{A} ⊆ V'(x). Thus (V(x) - {A}) ∩ v-prop-desc_M(B) = ∅.

But if A ∈ v-prop-desc_M(B) and B ≠ parent(A), then parent(A) ∈
v-prop-desc_M'(B) ∩ V'(x) -- a contradiction. Thus v-prop-desc_M(B)
∩ V(x) = ∅.

Thus V(x,B) = V'(x,B) = result(x,s), where s = <<visible_M(B,x);
O(x)>>.

We show that visible_M(B,x) = visible_M'(B,x). Clearly visible_M(B,x)
⊆ visible_M'(B,x). Let D ∈ visible_M'(B,x) - visible_M(B,x); we show
that the existence of D leads to a contradiction.

We apply (g) to D and B: We cannot have D ∈ visible_M(B) by our
assumption. If D ∈ dead_M(B), then D ∉ visible_M'(B) -- a
contradiction. Thus we are left with the third case: ∃D' ∈
prop-desc(B) ∩ prop-anc(D) ∩ V(x): (D ∈ visible_M(D')) ∧ (D' ∉
visible_M(B)).

D ∈ visible_M'(B) ⇒ D' ∈ v-prop-desc_M'(B). But if D' ≠ A, then D'
∈ V'(x) ∩ v-prop-desc_M'(B) -- a contradiction. If D' = A, then
parent(A) ∈ V'(x) ∩ v-prop-desc_M'(B) -- a contradiction.

Case 1b: B = parent(A).
v-prop-desc_M(A) ⊆ v-prop-desc_M'(A) ⊆ v-prop-desc_M'(parent(A)),
since A ∈ visible_M'(parent(A)), and V(x) ⊆ V'(x) ∪ {A}. Thus
v-prop-desc_M(A) ∩ V(x) = ∅.

Thus V(x,A) = V'(x,parent(A)) = result(x,s), where s =
<<visible_M(A,x); O(x)>>. But visible_M'(parent(A),x) = visible_M(A,x)
(since A ∈ committed_M'), and O'(x) = O(x).

Case 2: A ∉ V(x), ⇒ V'(x) = V(x). Thus B ∈ V(x).
v-prop-desc_M(B) ⊆ v-prop-desc_M'(B), ⇒ v-prop-desc_M(B) ∩ V(x)
= ∅. Thus V(x,B) = V'(x,B) = result(x,s), where s =
<<visible_M(B,x); O(x)>>. We must show visible_M(B,x) =
visible_M'(B,x). Clearly visible_M(B,x) ⊆ visible_M'(B,x). Let D ∈
visible_M'(B,x) - visible_M(B,x); we show that the existence of D leads
to a contradiction.

As in case (1a), we apply (g) to D and B; the only possible case is the
third: ∃D' ∈ prop-desc(B) ∩ prop-anc(D) ∩ V(x): (D ∈
visible_M(D')) ∧ (D' ∉ visible_M(B)).

But D ∈ visible_M'(B) ⇒ D' ∈ v-prop-desc_M'(B) ∩ V'(x) -- a
contradiction.


4. @abort[x] A

V'(x) = V(x) - desc(A).
B ∈ V'(x) ⇒ V'(x,B) = V(x,B).
committed_M' = committed_M.
aborted_M' = aborted_M ∪ {A}.
O'(x) = O(x).

a. B ∈ V'(x) ⇒ B ∈ V(x), B ∉ desc(A). Thus anc(B) ∩ aborted_M =
∅, ⇒ anc(B) ∩ aborted_M' = ∅, since B ∉ desc(A) and aborted_M'
= aborted_M ∪ {A}. Thus B is live in M'.

d. B ∈ datasteps_M'(x) ⇒ B ∈ datasteps_M(x) ⇒ B is dead in M ∨
∃B' ∈ anc(B) ∩ V(x): B ∈ visible_M(B').

If B is dead in M, then B is dead in M'.

If B' ∈ anc(B) ∩ V(x), and B' ∉ desc(A), then B' ∈ anc(B) ∩ V'(x),
and B ∈ visible_M'(B') since B ∈ visible_M(B').

If B' ∈ desc(A) then B ∈ desc(A), ⇒ A ∈ anc(B) ∩ aborted_M', ⇒
B is dead in M'.

e. B ∈ V'(x), v-prop-desc_M'(B) ∩ V'(x) = ∅,
⇒ B ∈ V(x), and B ∉ desc(A).

Suppose C ∈ v-prop-desc_M(B) ∩ V(x); then C ∉ v-prop-desc_M'(B)
∩ V'(x), ⇒ C ∈ desc(A) (since V'(x) = V(x) - desc(A)).

But B ∉ desc(A), so A ∈ prop-desc(B) ∩ anc(C). Then C ∉
visible_M(B) -- a contradiction.

Thus v-prop-desc_M(B) ∩ V(x) = ∅, ⇒ V(x,B) = V'(x,B) =
result(x,s), where s = <<visible_M(B,x); O(x)>> = <<visible_M'(B,x);
O'(x)>>.


Proof of (g): First we show that at least one of the three conditions must hold:

B ∈ V(x), C ∈ datasteps_M. But by (d), either C is dead in M, or ∃C' ∈ anc(C) ∩ V(x): C ∈
visible_M(C').

If C is dead in M, then either C ∈ dead_M(B), or lca(B,C) is dead in M. But if lca(B,C) is dead
in M, then B is dead in M, which contradicts (a). Thus we have case (g2).

So suppose ∃C' ∈ anc(C) ∩ V(x): C ∈ visible_M(C'). If C' ∈ visible_M(B), then C ∈
visible_M(B), which is case (g1). If C' ∉ visible_M(B), then (B,C') ∈ related, since V(x) forms an
ancestor chain (by (b)). But if C' ∈ anc(B), then C' ∈ visible_M(B). Thus C' ∈ prop-desc(B) ∩
prop-anc(C) ∩ V(x), which is case (g3).

To see that only one condition can hold, it is clear that (g1) and (g2) are mutually exclusive,
and that (g1) and (g3) are mutually exclusive. If (g3) holds, then C ∈ visible_M(C'), and C' ∈
V(x). But by (a), C' must be live in M, so C must be live in M; thus C ∉ dead_M(B). Thus (g2)
and (g3) are mutually exclusive.


Proof of (f): H = V(x).holder, B ∈ desc(H), B is live in M. We show visible_M(B,x) =
visible_M(H,x). Since B ∈ desc(H), it is obvious that visible_M(H,x) ⊆ visible_M(B,x). If B = H
then the result is obvious, so assume B ∈ prop-desc(H). Suppose D ∈ visible_M(B,x); we show
D ∈ visible_M(H,x). Let L = lca(B,D).

D ∈ visible_M(B,x) ⇒ D ∈ visible_M(L), ⇒ L ∈ prop-desc(H), since D ∉ visible_M(H).

Now we apply (g) to D and H: If (g2) holds, then D is dead to H. But since B is live in M, L
is not dead to H; thus D must be dead to B. But D ∈ dead_M(B) contradicts D ∈ visible_M(B).

(g3) cannot hold, because by definition of holder there is no D' ∈ prop-desc(H) ∩ V(x).

Thus (g1) must hold, ⇒ D ∈ visible_M(H). ∎
8.4 Proof of Possibilities Map for h_43

We now show that h is a possibilities map. Let I4 be the conjunction of all properties in Lemma
8.3.1. We will show that h is a possibilities map relative to I4.

Lemma 8.4.1: h preserves initial states.

Proof: Immediate, since h(<T_σ, L_σ, V_σ>) = <T_σ, L_σ>. ∎

Lemma 8.4.2: h preserves transitions relative to I4.

Proof: We must show that if <T,L,V> ∈ PRE_4(e) ∩ R_4 ∩ I4, and h(<T,L,V>) ∈ PRE_3(h(e))
∩ R_3, then h(<T,L,V>e) = h(<T,L,V>)h(e).

But h(<T,L,V>) = <T,L> and h(e) = e, so we must show the following:

If <T,L,V> ∈ PRE_4(e) ∩ R_4 ∩ I4, and <T,L> ∈ PRE_3(h(e)) ∩ R_3, and <T,L,V>e =
<T1,L1,V1>, then <T1,L1> = <T,L>e.

It is easily verified by inspection that the effects of all events on T and L are identical in L3
and L4; thus h preserves transitions relative to I4. ∎

Lemma 8.4.3: h preserves preconditions relative to I4.

Proof: We must show that if <T,L,V> ∈ PRE_4(e) ∩ R_4 ∩ I4, and h(<T,L,V>) ∈ R_3, then
h(<T,L,V>) ∈ PRE_3(h(e)).

Since h(<T,L,V>) = <T,L>, and h(e) = e, we must show
<T,L,V> ∈ PRE_4(e) ∩ R_4 ∩ I4, <T,L> ∈ R_3 ⇒ <T,L> ∈ PRE_3(e).

Preservation of preconditions is easily verified by inspection for all events other than perform,
since preconditions are identical in L3 and L4.

We prove preservation of preconditions for event e = perform A,u:

a. P4.4a ⇒ A ∈ @active_L[x].

b. B ∈ @datasteps_L[x](x) ⇒ B is dead in L(x) ∨ ∃B' ∈ anc(B) ∩ V(x): B ∈
@visible_L[x](B'), by 8.3.1d.

If B is dead in L(x), then anc(B) ∩ @aborted_L[x] ≠ ∅. But P4.4d ⇒ anc(A) ∩
@aborted_L[x] = ∅, ⇒ anc(lca(A,B)) ∩ @aborted_L[x] = ∅.

Thus anc(B) ∩ prop-desc(lca(A,B)) ∩ @aborted_L[x] ≠ ∅ ⇒ B ∈
@dead_L[x](A).

If B' ∈ anc(B) ∩ V(x), then B' ∈ anc(V(x).holder).
But A ∈ prop-desc(V(x).holder) by P4.4b, ⇒ A ∈ prop-desc(B').
Thus B ∈ @visible_L[x](B') ⇒ B ∈ @visible_L[x](A), by Lemma 2.2.3.1d.

c. P4.4c ⇒ u = V(x).value. Let H = V(x).holder (then u = V(x,H)).
By 8.3.1e, V(x,H) = result(x,s), where s = <<@visible_L[x](H,x); O(x)>>.

But A ∈ prop-desc(H) by P4.4b, and A is live in L(x) by P4.4d,
⇒ @visible_L[x](H,x) = @visible_L[x](A,x), by 8.3.1f.

Thus u = result(x,s'), where s' = <<@visible_L[x](A,x); O(x)>>.

d. B ∈ @visible_L[x](A,x) ⇒ anc(A) ∩ ABORTS_T(A|B) = ∅, directly by P4.4e. ∎


Lemma 8.4.4: h is a possibilities map relative to I4.

Proof: Follows immediately from Lemmas 8.4.1, 8.4.2, 8.4.3, and from Lemma 4.2.4.2.4. ∎

Theorem 8.4.5: h is a possibilities map, and I4 is invariant in L4.

Proof: By Lemma 8.3.1, I4 is invariant relative to h. By Lemma 8.4.4, h is a possibilities map
relative to I4. We apply Lemma 4.2.4.2.6 to conclude that h is a possibilities map, and I4 is an
invariant. ∎

Since h_43 is a possibilities map which fixes <T,L>, all invariants and pair-invariants from L3
carry down to L4. In the following Lemma we summarize all the known invariants and pair-invariants for
L4:

Lemma 8.4.6: Ia, I3, and I4 are invariant in L4, and Ja, J3 are pair-invariant in L4.

Proof: Invariance of I4 is shown in Theorem 8.4.5. Since h_43 is a possibilities map which
fixes <T,L>, and Ia, I3 are invariant in L3, Ia and I3 are invariant in L4, by Lemma 4.2.4.3.5.
Similarly, since Ja, J3 are pair-invariant in L3, Ja and J3 are pair-invariant in L4, by Lemma
4.2.4.3.5. ∎
9. Fully Localized Models


At Level 5 we completely localize all preconditions by "piggybacking" abort information on
communications steps. This additional information flow allows us to replace the non-local orphan
detection precondition (P4.4e) with a local check for orphans. Other than the new abort information in
communication steps (and the elimination of the non-local orphan precondition), Level 5 is identical to
Level 4.


Because all preconditions are localized at Level 5, we can project out the "virtual" global state to
define Level 6.

9.1 Level 5 Algebra

L5 = (E_5, Σ_5, σ_4, τ_5)

State Space:

Σ_5 = Σ_4 = {<T,L,V>}, where the components are:

T - the global state, an augmented action tree (as in L2),
L - local UAS's (as in L3),
V - value maps (as in L4).

The local steps are identical in L5 and L4, but for the communications events we introduce an
explicit "sender" of information. (Thus communications events are now parameterized by two locations:
the sender and the receiver.) This modification is necessary to describe precisely the set of aborts which
must be piggybacked on a communications event. (In fact this set will be all aborts known to the sender.)

Communications events:

@create[β,α] A,d -- send create message from β to α with aborts d
@commit[β,α] A,d -- send commit message from β to α with aborts d
@abort[β,α] A -- send abort message from β to α

The parameter "d" of create and commit messages models the DONE lists in remote invocation and
commit messages.

As in previous levels, the transition relation will be defined so that each communications event is
idempotent.

Transition Relation
Transition Relation 


Let e ∈ E_5, <T,L,V> ∈ Σ_5, <T,L,V>e = <T1,L1,V1>.

1. create A (A ∈ act - {U})
PRECONDITIONS:
a. A ∉ @vertices_L[creator(A)]
b. parent(A) ∈ @active_L[creator(A)]
c. (B,A) ∈ seq, B ≠ A ⇒ B ∈ @done_L[creator(A)]
TRANSITIONS:
a. vertices_T1 ← vertices_T ∪ {A}
b. status_T1(A) ← 'active'
c. @vertices_L1[creator(A)] ← @vertices_L[creator(A)] ∪ {A}
d. @status_L1[creator(A)](A) ← 'active'

2. commit A (A ∈ act - {U} - accesses)
PRECONDITIONS:
a. A ∈ @active_L[A]
b. @children_L[A](A) ⊆ @done_L[A]
TRANSITIONS:
a. status_T1(A) ← 'committed'
b. @status_L1[A](A) ← 'committed'

3. abort A (A ∈ act - {U})
PRECONDITIONS:
a. A ∈ @active_L[A]
TRANSITIONS:
a. status_T1(A) ← 'aborted'
b. @status_L1[A](A) ← 'aborted'

4. perform A,u (A ∈ accesses(x), u ∈ values(x))
PRECONDITIONS:
a. A ∈ @active_L[x]
b. A ∈ prop-desc(V(x).holder)
c. u = V(x).value
d. anc(A) ∩ @aborted_L[x] = ∅
TRANSITIONS:
a. status_T1(A) ← 'committed'
b. @status_L1[x](A) ← 'committed'
c. label_T1(A) ← u
d. data_T1 ← data_T ∪ {(B,A): B ∈ datasteps_T(x)} ∪ {(A,A)}
e. V1(x,parent(A)) ← update(A)(u)

5. @create[β,α] A,d (A ∈ act - {U}, β,α ∈ loc, d ⊆ act)
PRECONDITIONS:
a. A ∈ @active_L[β]
b. d = @aborted_L[β]
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. A ∉ @vertices_L[α] ⇒ @status_L1[α](A) ← 'active'
c. @aborted_L1[α] ← @aborted_L[α] ∪ d
d. α ∈ obj, D ∈ d, B ∈ desc(D) ⇒
V1(α,B) ← ⊥

6. @commit[β,α] A,d (A ∈ act - {U}, β,α ∈ loc, d ⊆ act)
PRECONDITIONS:
a. A ∈ @committed_L[β]
b. d = @aborted_L[β]
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. @status_L1[α](A) ← 'committed'
c. α ∈ obj, V(α,A) ≠ ⊥ ⇒
V1(α,A) ← ⊥
V1(α,parent(A)) ← V(α,A)
d. @aborted_L1[α] ← @aborted_L[α] ∪ d
e. α ∈ obj, D ∈ d, B ∈ desc(D) ⇒
V1(α,B) ← ⊥

7. @abort[β,α] A (A ∈ act - {U}, β,α ∈ loc)
PRECONDITIONS:
a. A ∈ @aborted_L[β]
TRANSITIONS:
a. @vertices_L1[α] ← @vertices_L[α] ∪ {A}
b. @status_L1[α](A) ← 'aborted'
c. α ∈ obj, B ∈ desc(A) ⇒
V1(α,B) ← ⊥


The preconditions and transitions for all local events are identical in L5 and L4 (except that the 
non-local orphan detection precondition, P4.4e, is eliminated at Level 5). Communications events are 
fundamentally different at Level 5, since orphan detection information is explicitly passed between 


locations with create and commit messages. 


The orphan information that we include with create and commit messages is quite simple: a 
sending location must piggyback all the aborts it knows about onto these messages. These messages now 
correspond closely to the create and commit messages of the simplified orphan detection algorithm that 
we presented in Chapter 1. The "known aborts sect" in these messages models the “DONE” list in the 
messages of this algorithm. While we show below that this information is sufficient (because there is a 
possibilities map from L5 to L4), other choices are possible. As a simple example, we conjecture that it is 
only necessary to send a covering subset of the known aborts in create and commit messages, because such 
a subset captures the same information about potential orphans. We have not attempted to take such 
optimizations into account, and we have focused on simplicity of description for our model. In general, at 
every level of our algebra hierarchy we make additional choices about the details of our model, and we 


further restrict the possible implementations which fit this model. 


In our Level 5 model we do not piggyback the known aborts set onto abort messages. We can
explain the difference between abort messages and create or commit messages by recalling (from Chapter
3) that in our idealized transaction system, aborts transfer no information (other than the fact that the
abort occurred). Because the receiver of an abort message does not "learn anything" about the execution
history, the sender need not tell the receiver about all potential orphans. Internal consistency is achieved
by coupling the flow of normal information with the flow of orphan information. (In this case "orphan
information" is just the set of known aborts at the sender.) Of course, it would not hurt to piggyback
known aborts onto abort messages, and this additional information might allow some orphans to be
detected sooner.
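
The piggybacking rule can be illustrated with a small sketch. This is not the thesis's formal model: the class `Location` and the functions `deliver_create` and `deliver_abort` are hypothetical names, a location's UAS is reduced to a status table, and value maps are omitted. It only shows that create messages carry the sender's full known aborts set d, while abort messages carry nothing beyond the abort itself.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    """Simplified local knowledge at one location: action name -> status."""
    status: dict = field(default_factory=dict)

    def known_aborts(self):
        # The set of actions this location knows to be aborted.
        return frozenset(a for a, s in self.status.items() if s == "aborted")


def deliver_create(sender, receiver, action):
    """Sketch of a create message with the sender's known aborts piggybacked."""
    assert sender.status.get(action) == "active"   # the action is active at the sender
    d = sender.known_aborts()                      # d is everything the sender knows aborted
    receiver.status.setdefault(action, "active")   # learn the action; keep status if already known
    for aborted in d:                              # the receiver's known aborts grow by d
        receiver.status[aborted] = "aborted"
    return d


def deliver_abort(sender, receiver, action):
    """Sketch of an abort message: only the abort of `action` is transferred."""
    assert sender.status.get(action) == "aborted"
    receiver.status[action] = "aborted"
```

After `deliver_create`, the receiver knows at least every abort the sender knew when the message was built, which is the property the Level 5 invariants rely on.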


9.2 Specification of Mapping h_54


We define a (single-state) mapping from L5 to L4, h_54: L5 → L4. (We abbreviate "h_54" as "h"
in this chapter.)


State Mapping 
h: Σ_5 → Σ_4 is the identity mapping: h(<T,L,V>) = <T,L,V> ∀<T,L,V> ∈ Σ_5. Thus h fixes <T,L,V>.


Event Mapping 


h: ℰ_5 → ℰ_4 is defined as follows. Let ord4 be an arbitrary total order on ℰ_4, and let aborts-in(d) =
{@abort[α] D: D ∈ d}.


h: create A → create A
commit A → commit A
abort A → abort A
perform A,u → perform A,u
@create[β,α] A,d → @create[α] A * <<aborts-in(d); ord4>>
@commit[β,α] A,d → @commit[α] A * <<aborts-in(d); ord4>>
@abort[β,α] A → @abort[α] A


Note that we map communications events @create and @commit into a sequence of events at Level 4. 
This sequence first creates or commits the primary action in the message, and then effectively aborts all 
actions in the aborts list, d. We will show that the order in which these abort events occur is unimportant; 
thus we let ord4 be an arbitrary order. 
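
As a rough illustration of this event mapping, the following sketch maps one Level 5 event to a list of Level 4 events. The tuple encoding of events and the name `h_event` are assumptions made for the example; the optional `order_key` stands in for the arbitrary total order ord4.

```python
def h_event(event, order_key=None):
    """Map a Level 5 event (encoded as a tuple) to a list of Level 4 events.

    Assumed encodings:
      local events:  ("create", A), ("commit", A), ("abort", A), ("perform", A, u)
      communication: ("@create", beta, alpha, A, d), ("@commit", beta, alpha, A, d),
                     ("@abort", beta, alpha, A)
    """
    kind = event[0]
    if kind in ("create", "commit", "abort", "perform"):
        return [event]                               # local events map to themselves
    if kind in ("@create", "@commit"):
        _kind, _beta, alpha, action, d = event
        # aborts-in(d), arranged by an arbitrary but fixed total order
        aborts = sorted((("@abort", alpha, D) for D in d), key=order_key)
        return [(kind, alpha, action)] + aborts      # primary event first, then the aborts
    if kind == "@abort":
        _kind, _beta, alpha, action = event
        return [("@abort", alpha, action)]           # the sender is dropped at Level 4
    raise ValueError(f"unknown event kind: {kind!r}")
```

For instance, `h_event(("@create", "b", "a", "A", frozenset({"D1", "D2"})))` yields the Level 4 create at location "a" followed by one abort event per element of d.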

9.3 Level 5 Invariants


Before stating the Level 5 invariants, we state two preliminary lemmas which will be used below:


Lemma 9.3.1: Let <T,L,V> ∈ ℛ_5 ∩ PRE_5(e), and <T',L',V'> = <T,L,V>e. Suppose that
<T,L,V> satisfies I3, and (<T,L,V>,<T',L',V'>) satisfies Ja and J3. If ABORTS_T(A) <
@aborted_L[α], and A ∈ @committed_L[α], then

ABORTS_T'(A) < @aborted_L'[α].

Proof: If A ∈ @committed_L[α], then by Lemma 7.3.2f, A ∈ committed_T
⇒ ABORTS_T'(A) < ABORTS_T(A), by Lemma 6.3.3.3.

But @aborted_L[α] ⊆ @aborted_L'[α], by Lemma 7.3.1a ⇒ @aborted_L[α] < @aborted_L'[α],
by Lemma 2.2.1.1a.

And ABORTS_T(A) < @aborted_L[α], by hypothesis.
Thus ABORTS_T'(A) < @aborted_L'[α], by transitivity of <. ∎

Lemma 9.3.2: Let <T,L,V> ∈ ℛ_5 ∩ PRE_5(e), and <T',L',V'> = <T,L,V>e. Suppose that
<T,L,V> satisfies I3, and (<T,L,V>,<T',L',V'>) satisfies Ja and J3. If SEQ-ABORTS_T(A) <
@aborted_L[α], and A ∈ @active_L[α], then

SEQ-ABORTS_T'(A) < @aborted_L'[α].

Proof: If A ∈ @active_L[α], then by Lemma 7.3.2f, A ∈ vertices_T
⇒ SEQ-ABORTS_T'(A) < SEQ-ABORTS_T(A), by Lemma 6.3.3.4.

But @aborted_L[α] ⊆ @aborted_L'[α], by Lemma 7.3.1a ⇒ @aborted_L[α] < @aborted_L'[α],
by Lemma 2.2.1.1a.

And SEQ-ABORTS_T(A) < @aborted_L[α], by hypothesis.

Thus SEQ-ABORTS_T'(A) < @aborted_L'[α], by transitivity of <. ∎

The following invariants are our key result for Level 5. They express the fact that the local states
have the proper abort information at all times. We show these invariants relative to mapping h: since h
fixes <T,L,V>, all invariants and pair-invariants in L4 can be applied to the proofs (by Lemma 4.2.4.3.6).
(Recall that we have shown in Lemma 8.4.6 that I4, I3, and Ia are invariant in L4, and J3 and Ja are
pair-invariant in L4.)

Lemma 9.3.3: Let <T,L,V> ∈ ℛ_5. Then the following are invariant relative to h:
(∀α ∈ loc)

a. A ∈ @committed_L[α] ⇒ ABORTS_T(A) < @aborted_L[α]

b. A ∈ @active_L[α] ⇒ SEQ-ABORTS_T(A) < @aborted_L[α]

Proof: It is trivial to show that (a),(b) are 0-invariant (i.e. that they hold for σ_5): (a) holds
vacuously, and for (b) only U ∈ @vertices_L0[α], but SEQ-ABORTS_T0(U) = ∅.


For the induction step, let <T,L,V> ∈ ℛ_5 ∩ PRE_5(e), and assume that (a),(b) hold for
<T,L,V>. Let <T',L',V'> = <T,L,V>e. We must show that (a),(b) hold for <T',L',V'>. By
Lemma 4.2.4.3.6, we can assume that <T,L,V> and <T',L',V'> satisfy any invariants from I4, I3
or Ia, and we can assume that (<T,L,V>,<T',L',V'>) satisfies any pair-invariants from J3 or Ja.

Using the Induction Hypothesis, and invariants I3, J3, Ja, we can apply Lemmas 9.3.1 and
9.3.2 to conclude that

A ∈ @committed_L[α] ⇒ ABORTS_T'(A) < @aborted_L'[α].
A ∈ @active_L[α] ⇒ SEQ-ABORTS_T'(A) < @aborted_L'[α].

Thus we need only show that (a) holds (respectively, (b) holds) for <T',L',V'> where A ∈
@committed_L'[α] - @committed_L[α] (respectively, A ∈ @active_L'[α] - @active_L[α]). We
consider all possible events, e, for these remaining cases:


1. create A (note that A ≠ U)

@committed_L'[α] = @committed_L[α].
@active_L'[α] - @active_L[α] = {A}, for α = creator(A).
@active_L'[α] = @active_L[α], for α ≠ creator(A).

a. Holds vacuously.

b. (We need only consider α = creator(A).)
By P5.1b, parent(A) ∈ @active_L[α],
⇒ SEQ-ABORTS_T'(parent(A)) < @aborted_L'[α].

By P5.1c, (B,A) ∈ seq, B ≠ A ⇒ B ∈ @done_L[α]. Thus
B ∈ v-seq_T'(A) ⇒ B ∈ @committed_L[α],
⇒ ABORTS_T'(B) < @aborted_L'[α].

B ∈ i-seq_T'(A) ⇒ B ∈ @aborted_L[α] ⇒ B ∈ @aborted_L'[α],
⇒ i-seq_T'(A) < @aborted_L'[α].

Thus SEQ-ABORTS_T'(parent(A)) ∪ ⋃_{B ∈ v-seq_T'(A)} ABORTS_T'(B)
∪ i-seq_T'(A) < @aborted_L'[α], by Lemma 2.2.1.1c.

Thus SEQ-ABORTS_T'(A) < @aborted_L'[α], by Lemma 6.2.1.4.


2. commit A (note A ∉ accesses)

@committed_L'[α] - @committed_L[α] = {A}, for α = A.
@committed_L'[α] = @committed_L[α], for α ≠ A.
@active_L'[α] ⊆ @active_L[α].

a. (We need only consider α = A.)
ABORTS_T'(A) = i-precedes_T'(A) ∪ ⋃_{B ∈ v-precedes_T'(A)} ABORTS_T'(B).
Since A ∉ accesses, v-data-anc_T'(A) = i-data-anc_T'(A) = ∅; thus
v-precedes_T'(A) = v-anc-seq_T'(A) ∪ v-child_T'(A), and
i-precedes_T'(A) = i-anc-seq_T'(A) ∪ i-child_T'(A).

Thus ABORTS_T'(A) = i-anc-seq_T'(A) ∪ ⋃_{B ∈ v-anc-seq_T'(A)} ABORTS_T'(B) ∪
i-child_T'(A) ∪ ⋃_{B ∈ v-child_T'(A)} ABORTS_T'(B)
= SEQ-ABORTS_T'(A) ∪ i-child_T'(A) ∪ ⋃_{B ∈ v-child_T'(A)} ABORTS_T'(B).

But A ∈ @active_L[A] by P5.2a, ⇒ SEQ-ABORTS_T'(A) <
@aborted_L'[A], by Lemma 9.3.2.

If B ∈ children_T'(A), then B ∈ children_T(A). But by Lemma 7.3.2a,
B ∈ children_T(A) ⇒ B ∈ @vertices_L[A]. By P5.2b,
@children_L[A](A) ⊆ @done_L[A]. From Lemma 7.3.2f, it follows
that v-child_T'(A) ⊆ @committed_L[A], and i-child_T'(A) ⊆
@aborted_L[A].

Thus B ∈ v-child_T'(A) ⇒ ABORTS_T'(B) < @aborted_L'[A], by
Lemma 9.3.1.

B ∈ i-child_T'(A) ⇒ B ∈ @aborted_L[A] ⇒ B ∈ @aborted_L'[A], by
Lemma 7.3.1a.

Thus we have shown
SEQ-ABORTS_T'(A) < @aborted_L'[A],
i-child_T'(A) < @aborted_L'[A], and
⋃_{B ∈ v-child_T'(A)} ABORTS_T'(B) < @aborted_L'[A].

ABORTS_T'(A) < @aborted_L'[A] follows directly from Lemma
2.2.1.1c.


b. Holds vacuously. 


3. abort A

@committed_L'[α] = @committed_L[α]
@active_L'[α] ⊆ @active_L[α]


a. Holds vacuously. 


b. Holds vacuously. 


4. perform A,u

@committed_L'[α] - @committed_L[α] = {A}, for α = x (x = object(A))
@committed_L'[α] = @committed_L[α], for α ≠ x
@active_L'[α] ⊆ @active_L[α]

a. (We need only consider α = x.)
ABORTS_T'(A) = i-precedes_T'(A) ∪ ⋃_{B ∈ v-precedes_T'(A)} ABORTS_T'(B).

Since A ∈ accesses, v-child_T'(A) = i-child_T'(A) = ∅; thus
v-precedes_T'(A) = v-anc-seq_T'(A) ∪ v-data-anc_T'(A), and
i-precedes_T'(A) = i-anc-seq_T'(A) ∪ i-data-anc_T'(A).
Thus ABORTS_T'(A) = i-anc-seq_T'(A) ∪ ⋃_{B ∈ v-anc-seq_T'(A)} ABORTS_T'(B) ∪
i-data-anc_T'(A) ∪ ⋃_{B ∈ v-data-anc_T'(A)} ABORTS_T'(B)
= SEQ-ABORTS_T'(A) ∪ i-data-anc_T'(A) ∪ ⋃_{B ∈ v-data-anc_T'(A)} ABORTS_T'(B).

But A ∈ @active_L[x] by P5.4a, ⇒ SEQ-ABORTS_T'(A) <
@aborted_L'[x], by Lemma 9.3.2.

If (B,A) ∈ v-data_T', then B ∈ @visible_L'[x](A), by Lemma 7.3.2j, ⇒
B ∈ @visible_L[x](A).

Thus A]B ∈ @committed_L[x] ⇒ ABORTS_T'(A]B) <
@aborted_L'[x], by Lemma 9.3.1.

If (B,A) ∈ i-data_T', then B ∈ @dead_L'[x], by Lemma 7.3.2k. But B ∈
@dead_L'[x] ⇒ {crucial_T'(B)} < @aborted_L'[x], by Lemma 7.3.2l.

Thus we have shown
SEQ-ABORTS_T'(A) < @aborted_L'[x],
i-data-anc_T'(A) < @aborted_L'[x], and
⋃_{B ∈ v-data-anc_T'(A)} ABORTS_T'(B) < @aborted_L'[x].
ABORTS_T'(A) < @aborted_L'[x] follows directly from Lemma
2.2.1.1c.


b. Holds vacuously. 


5. @create[β,α] A,d

@committed_L'[γ] = @committed_L[γ].
@vertices_L'[γ] = @vertices_L[γ] ∪ {A}, for γ = α (unchanged for all other
locations).
@active_L'[α] - @active_L[α] ⊆ {A} (might be ∅).
@active_L'[γ] = @active_L[γ], for γ ≠ α.

a. Holds vacuously.

b. (We need only consider γ = α.)
By P5.5a, A ∈ @active_L[β]; thus SEQ-ABORTS_T(A) <
@aborted_L[β], by Lemma 9.3.1.

But d = @aborted_L[β] by P5.5b, and d ⊆ @aborted_L'[α] (by
T5.5c). Thus SEQ-ABORTS_T(A) < @aborted_L'[α].

But A ∈ @active_L[β] ⇒ A ∈ vertices_T, by Lemma 7.3.2f, ⇒
SEQ-ABORTS_T'(A) < SEQ-ABORTS_T(A), by Lemma 6.3.3.4.

Thus SEQ-ABORTS_T'(A) < @aborted_L'[α], by transitivity of <.

6. @commit[β,α] A,d

@committed_L'[α] - @committed_L[α] ⊆ {A}.
@committed_L'[γ] = @committed_L[γ], for γ ≠ α.
@active_L'[γ] ⊆ @active_L[γ].

a. (We need only consider γ = α.)

By P5.6a, A ∈ @committed_L[β]; thus ABORTS_T(A) <
@aborted_L[β], by Lemma 9.3.1.

But d = @aborted_L[β] by P5.6b, and d ⊆ @aborted_L'[α] (by
T5.6d). Thus ABORTS_T(A) < @aborted_L'[α].

But A ∈ @committed_L[β] ⇒ A ∈ committed_T, by Lemma 7.3.2f,
⇒ ABORTS_T'(A) < ABORTS_T(A), by Lemma 6.3.3.3.

Thus ABORTS_T'(A) < @aborted_L'[α], by transitivity of <.


b. Holds vacuously. 


7. @abort[β,α] A

@committed_L'[γ] = @committed_L[γ].
@active_L'[γ] ⊆ @active_L[γ].

a. Holds vacuously.

b. Holds vacuously. ∎


9.4 Proof of Possibilities Map for h_54


We now show that h is a possibilities map. Let I5 be the conjunction of all properties in Lemma
9.3.3. We will show that h is a possibilities map relative to I5.


Lemma 9.4.1: h preserves initial states.


Proof: Immediate since h(<T_0,L_0,V_0>) = <T_0,L_0,V_0>. ∎


Lemma 9.4.2: h preserves transitions relative to I5.

Proof: We must show that if <T,L,V> ∈ PRE_5(e) ∩ ℛ_5 ∩ I5, and h(<T,L,V>) ∈ PRE_4(h(e))
∩ ℛ_4, then h(<T,L,V>e) = h(<T,L,V>)h(e).

But h(<T,L,V>) = <T,L,V>, so we must show the following:

Let <T,L,V> ∈ PRE_5(e) ∩ ℛ_5 ∩ PRE_4(h(e)) ∩ I5.
Let <T,L,V>e = <T1,L1,V1> (in L5), <T2,L2,V2> = <T,L,V>h(e) (in L4).
Then <T1,L1,V1> = <T2,L2,V2>.

For the local steps (create, commit, abort, perform), h(e) = e, and it is easily verified by
inspection that the effects of these events on T, L, and V are identical in L5 and L4. It is also
easily verified by inspection that the effect of @abort[β,α] A is identical to the effect of
@abort[α] A.

For communications events @create and @commit, transition steps T5.5a,b and T5.6a,b,c
are identical to transition effects T4.5a,b and T4.6a,b,c, respectively. Transition steps
T5.5c,d and T5.6d,e, respectively, accomplish the same effect as the sequence of aborts
<<aborts-in(d); ord4>>: adding all aborts in d to @aborted_L[α] (Level 5) has the same effect
as adding them individually (Level 4). To see that updating of value maps is also preserved,
note that an abort at an object removes all descendants of the aborted action from the value
map at that object. But individually removing descendants of each action (Level 4) has the
same effect as removing all descendants of actions in d at once (Level 5). Note that this
removal of descendants is clearly commutative, and thus the order of abort steps in
aborts-in(d) makes no difference. ∎
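
The commutativity argument can be checked on a small example. The sketch below is only illustrative: the action tree is given as a hypothetical `parent` table, a value map is a dictionary keyed by (object, action) with ⊥ modelled by a missing key, and `abort_one` and `abort_all` are assumed names for the Level 4 style and Level 5 style updates. We assume desc(A) includes A itself.

```python
from itertools import permutations


def descendants(parent, action):
    """All actions whose ancestor chain (including the action itself) contains `action`."""
    result = set()
    for node in parent:
        cur = node
        while cur is not None:
            if cur == action:
                result.add(node)
                break
            cur = parent[cur]
    return result


def abort_one(value_map, parent, action):
    """Level 4 style: drop value map entries held by descendants of one aborted action."""
    doomed = descendants(parent, action)
    return {k: v for k, v in value_map.items() if k[1] not in doomed}


def abort_all(value_map, parent, d):
    """Level 5 style: drop entries held by descendants of any action in d, all at once."""
    doomed = set()
    for action in d:
        doomed |= descendants(parent, action)
    return {k: v for k, v in value_map.items() if k[1] not in doomed}


def orders_agree(value_map, parent, d):
    """Check that every ordering of the individual aborts matches the batch update."""
    batch = abort_all(value_map, parent, d)
    for order in permutations(sorted(d)):
        step = value_map
        for action in order:
            step = abort_one(step, parent, action)
        if step != batch:
            return False
    return True
```

Because each abort only removes entries, and removal of one set of keys does not depend on which other keys are present, the result is the same set difference regardless of order.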


Lemma 9.4.3: h preserves preconditions relative to I5.

Proof: We must show that if <T,L,V> ∈ PRE_5(e) ∩ ℛ_5 ∩ I5, and h(<T,L,V>) ∈ ℛ_4, then
h(<T,L,V>) ∈ PRE_4(h(e)).

Since h(<T,L,V>) = <T,L,V>, we show
<T,L,V> ∈ PRE_5(e) ∩ ℛ_5 ∩ I5 ∩ ℛ_4 ⇒ <T,L,V> ∈ PRE_4(h(e)).

Preservation of preconditions is easily verified by inspection for all local steps other than
perform, since preconditions are identical in L5 and L4. We prove preservation of
preconditions for event e = perform A,u, and for the communications steps:

1. perform A,u
a. P5.4a ⇔ P4.4a.
b. P5.4b ⇔ P4.4b.
c. P5.4c ⇔ P4.4c.
d. P5.4d ⇔ P4.4d.

e. B ∈ @visible_L[x](A,x) ⇒ A]B ∈ @committed_L[x]
⇒ ABORTS_T(A]B) < @aborted_L[x] by Lemma 9.3.3a.

But by P5.4d, anc(A) ∩ @aborted_L[x] = ∅. Thus anc(A) ∩
ABORTS_T(A]B) = ∅, by Lemma 2.2.1.1d.


2. e = @create[β,α] A,d
h(e) = @create[α] A * <<aborts-in(d); ord4>>

First we show that <T,L,V> ∈ PRE_4(@create[α] A):

a. P5.5a ⇒ A ∈ @active_L[β], ⇒ A ∈ @vertices_L[β], which
automatically satisfies P4.5a. (P4.5a requires that there be some β for
which A ∈ @vertices_L[β].)

Now let e' be the prefix of h(e) preceding @abort[α] D (where D ∈ d), and let
<T1,L1,V1> = <T,L,V>e' (in L4). We show that <T1,L1,V1> ∈ PRE_4(@abort[α] D):

a. D ∈ d ⇒ D ∈ @aborted_L[β] ⇒ D ∈ aborted_T, by Lemma 7.3.2f,
⇒ D ∈ @aborted_L[D] by Lemma 7.3.2c,
⇒ D ∈ @aborted_L1[D] by Lemma 7.3.1a.

3. e = @commit[β,α] A,d
h(e) = @commit[α] A * <<aborts-in(d); ord4>>

First we show that <T,L,V> ∈ PRE_4(@commit[α] A):

a. P5.6a ⇒ A ∈ @committed_L[β], which satisfies P4.6a.

Now let e' be the prefix of h(e) preceding @abort[α] D (where D ∈ d), and let
<T1,L1,V1> = <T,L,V>e' (in L4). We show that <T1,L1,V1> ∈ PRE_4(@abort[α] D):

a. D ∈ d ⇒ D ∈ @aborted_L[β] ⇒ D ∈ aborted_T, by Lemma 7.3.2f,
⇒ D ∈ @aborted_L[D] by Lemma 7.3.2c,
⇒ D ∈ @aborted_L1[D] by Lemma 7.3.1a.

4. @abort[β,α] A

a. P5.7a ⇒ A ∈ @aborted_L[β], which satisfies P4.7a. ∎


Lemma 9.4.4: h is a possibilities map relative to I5.

Proof: Follows immediately from Lemmas 9.4.1, 9.4.2, 9.4.3, and from Lemma 4.2.4.2.4. ∎

Theorem 9.4.5: h is a possibilities map, and I5 is invariant in L5.

Proof: By Lemma 9.3.3, I5 is invariant relative to h. By Lemma 9.4.4, h is a possibilities map
relative to I5. We apply Lemma 4.2.4.2.6 to conclude that h is a possibilities map, and I5 is an
invariant. ∎

Since h_54 is a possibilities map which fixes <T,L,V>, all invariants and pair-invariants from L4
carry down to L5. We summarize the invariants for L5 as follows:

Lemma 9.4.6: Ia, I3, I4, and I5 are invariant in L5, and Ja, J3 are pair-invariant in L5.

Proof: Invariance of I5 is shown in Theorem 9.4.5. Since h_54 is a possibilities map which
fixes <T,L,V>, and Ia, I3, and I4 are invariant in L4, Ia, I3, and I4 are invariant in L5, by
Lemma 4.2.4.3.5. Similarly since Ja, J3 are pair-invariant in L4, Ja and J3 are pair-invariant
in L5, by Lemma 4.2.4.3.5. ∎

9.5 Level 6 Algebra and Mapping h_65


At Level 6 we remove the global action tree, T. Since we have localized all preconditions in
Level 5, the global tree can now be properly regarded as a "virtual" component of the state.

L6 = (ℰ_6, Σ_6, σ_6, τ_6)
Σ_6 = {<L,V>}, where the components are:

L - local UAS's (as in L3)
V - value maps (as in L4)

σ_6 = <L_0, V_0>

L_0, V_0 - as in L4

τ_6 is identical to τ_5, except that all transitions involving T are discarded (T5.1a,b, T5.2a, T5.3a, T5.4a,c,d).


Let h_65: L6 → L5 be the augmentation map from Level 6 to Level 5 (Definition 4.2.5.1). Thus h_65 is the
identity map on events, and the state mapping maps <L,V> to all possible states <T,L,V> in Σ_5.


Theorem 9.5.1: h_65 is a possibilities map.

Proof: Follows immediately from Lemma 4.2.5.3. ∎

By Lemma 4.2.5.2, h_65 fixes <L,V>, so all invariants and pair-invariants for L and V from L5 carry down
to L6. Most of these properties involve T, but all invariants from I4 except for 8.3.1e do not involve T,
nor do the pair-invariants J3. We summarize these invariants and pair-invariants for L6 in the following
Lemma. Let I4' denote I4 with 8.3.1e removed. (Thus I4' is just all invariants from I4 which apply to the
local state <L,V>.)

Lemma 9.5.2: I4' is invariant for L6, and J3 is pair-invariant in L6.

Proof: Since h_65 is a possibilities map which fixes <L,V>, and I4' is invariant for <L,V> in L5,
I4' is invariant in L6, by Lemma 4.2.4.3.5. Similarly since J3 is pair-invariant for <L,V> in L5,
J3 is pair-invariant in L6, by Lemma 4.2.4.3.5. ∎

10. Distributed System Model


Level 7 is our lowest-level model of the transaction system. At this level we partition the system
state among nodes, and we use a communications model which takes into account arbitrary delays in
message delivery. This model is a message-based distributed event-state algebra as described in Chapter
4. Nodes communicate by sending and receiving messages via a message buffer.


We require that each object and each action reside at a particular node (its "home node"). A
node's state consists of a UAS and a value map for each object which resides there. We can thus view
nodes as a grouping structure for the "tree locations" from Level 6. The mapping from node states (Level
7) to local states at tree locations (Level 6) is a straightforward "explosion" of the node states. Similarly
the Level 6 value map can be constructed from the value maps at each node.


The only complexity in mapping from Level 7 to Level 6 is in modeling the communications
delays at Level 7, since the communications events at Level 6 are "instantaneous." We resolve this
discrepancy by treating messages themselves as locations. We regard a message as an initially empty
"slot" for information; once this message is sent, the slot is filled. Since messages are never removed from
the message buffer in our Level 7 model, it is natural to regard this message slot as a "location" at Level 6.
The communications delay at Level 7 is explained at Level 6 by imagining that all messages are
instantaneous, but that they are sent indirectly via another location (the message slot).

10.1 Level 7 Algebra 


L7 = (ℰ_7, Σ_7, σ_7, τ_7)


The Level 7 Algebra is a message-based algebra as defined in Chapter 4 (Definition 4.3.2.1). Let
Nodes = {1,2,...,n} name the nodes in the system, and let "buf" name the message buffer. We will use Γ
= Σ_7 in this chapter, so that we can subscript the state space without confusing these subspaces with the
state spaces of higher levels in our algebra hierarchy.


We assume that each object in the system resides at a particular node, and each action runs at a
single node. We call this node the home node of the action or object. Formally,


home: tloc → Nodes. (If A ∈ accesses, then we will use home(A) synonymously with home(object(A)).)
Let obj(i) = {x ∈ obj: home(x) = i},
act(i) = {A ∈ act - {U} - accesses: home(A) = i},
tloc(i) = obj(i) ∪ act(i).


State Space:


The local state at a node consists of a UAS for the node together with a "local" value map for each object
whose home is at that node:
Γ_i = {<l,v>: l ∈ UAS, and v: obj(i) × act → values(obj) ∪ {⊥}}, where i ∈ Nodes.


If D ∈ Γ and i ∈ Nodes, then we denote the UAS and value map components of D.i by D.i.l and D.i.v,
respectively. We extend the definitions of V(x), V(x).holder, V(x).value, etc., from value maps to "local
value maps" in the obvious way.

The set of messages is defined as follows:


Msgs = {#create(i,j) A,d: i,j ∈ Nodes, A ∈ act - {U}, d ⊆ act}
∪ {#commit(i,j) A,d: i,j ∈ Nodes, A ∈ act - {U} - accesses, d ⊆ act}
∪ {#abort(i,j) A: i,j ∈ Nodes, A ∈ act - {U}}


The message buffer space is Γ_buf = 𝒫(Msgs).


If D ∈ Γ, and i ∈ Nodes, then we abbreviate any function prop_{D.i.l} by #prop_D[i]. (This notation is
similar to the notation introduced for locations, but note that i is now a node rather than a location.)

Initial State:
σ_7 = D_0, where D_0 is defined by


D_0.buf = ∅,
∀i ∈ Nodes, D_0.i.l = T_0, the trivial UAS, and
D_0.i.v(x,U) = init(x), ∀x ∈ obj(i),
D_0.i.v(x,A) = ⊥, ∀A ≠ U.


The UAS and value map components of D_0.i correspond in a natural way to L_0 and V_0.

Events:


ℰ_7 consists of local events (create, commit, abort, perform), and communications events send M,
receive M, for M ∈ Msgs. Local events are similar to the corresponding local events at Level 6.


At Level 7 we include a qualifier "(d)" on create, commit, and perform events. For example, a
create event takes the form: create A(d), where d ⊆ act. The preconditions for the create, commit and
perform events require that "d" be the set of known aborts at the node where the event occurs. "d" does
not enter into any transitions. We can thus regard "(d)" as recording the set of known aborts when the
event occurs; including this qualifier does not change the semantics of the events. The qualifier "(d)" is
useful when we construct a mapping from Level 7 to Level 6: "local" events at Level 7 will map into a
local event at Level 6 plus a sequence of communications events at Level 6. (Conceptually in this
mapping we regard the occurrence of an event at a node as an occurrence at a single location at that node,
followed by a broadcast of the event (with Level 6 communications events) to all other locations at that
node. Of course, at Level 7 no "real" communications events occur.) Because these Level 6
communications events require a "done" list, we extract it from the "(d)" in the Level 7 event.


(This device of qualifying events with a part of the state allows us to construct an
event-homomorphic mapping between algebras. If the qualifier were not used, then the proper mapping
from a lower-level event to the higher-level sequence of events would depend on the lower-level state as
well as on the lower-level event, i.e. the event mapping would not be event-homomorphic.)

Transition Relation


Although Definition 4.3.1.1 describes the total transition relation of a message-based algebra in
terms of local transition relations for each component, we will not describe local transition relations
individually. Instead we present the total transition relation. It should be clear that preconditions and
effects are properly localized (i.e. the local transition relations could be constructed easily from our total
transition relation).


Let e ∈ ℰ_7, D ∈ Γ, De = D1.


1. create A (d) (A ∈ act - {U}, home(creator(A)) = i, d ⊆ act)
PRECONDITIONS:
a. A ∉ #vertices_D[i]
b. parent(A) ∈ #active_D[i]
c. (B,A) ∈ seq, B ≠ A ⇒ B ∈ #done_D[i]
d. d = #aborted_D[i]
TRANSITIONS:
a. #vertices_D1[i] ← #vertices_D[i] ∪ {A}
b. #status_D1[i](A) ← 'active'

2. commit A (d) (A ∈ act - {U} - accesses, home(A) = i, d ⊆ act)
PRECONDITIONS:
a. A ∈ #active_D[i]
b. #children_D[i](A) ⊆ #done_D[i]
c. d = #aborted_D[i]
TRANSITIONS:
a. #status_D1[i](A) ← 'committed'
b. ∀x ∈ obj(i), D.i.v(x,A) ≠ ⊥ ⇒
D1.i.v(x,A) ← ⊥
D1.i.v(x,parent(A)) ← D.i.v(x,A)

3. abort A (A ∈ act - {U}, home(A) = i)
PRECONDITIONS:
a. A ∈ #active_D[i]
TRANSITIONS:
a. #status_D1[i](A) ← 'aborted'
b. ∀x ∈ obj(i), B ∈ desc(A) ⇒
D1.i.v(x,B) ← ⊥

4. perform A,u (d) (A ∈ accesses(x), u ∈ values(x), home(x) = i, d ⊆ act)
PRECONDITIONS:
a. A ∈ #active_D[i]
b. A ∈ prop-desc(D.i.v(x).holder)
c. u = D.i.v(x).value
d. anc(A) ∩ #aborted_D[i] = ∅
e. d = #aborted_D[i]
TRANSITIONS:
a. #status_D1[i](A) ← 'committed'
b. D1.i.v(x,parent(A)) ← update(A,u)


5. send #create(i,j) A,d (A ∈ act - {U}; i,j ∈ Nodes; d ⊆ act)
PRECONDITIONS:
a. A ∈ #active_D[i]
b. d = #aborted_D[i]
TRANSITIONS:
a. D1.buf ← D.buf ∪ {#create(i,j) A,d}

6. receive #create(i,j) A,d (A ∈ act - {U}; i,j ∈ Nodes; d ⊆ act)
PRECONDITIONS:
a. #create(i,j) A,d ∈ D.buf
TRANSITIONS:
a. #vertices_D1[j] ← #vertices_D[j] ∪ {A}
b. A ∉ #vertices_D[j] ⇒ #status_D1[j](A) ← 'active'
c. #aborted_D1[j] ← #aborted_D[j] ∪ d
d. ∀x ∈ obj(j), C ∈ d, B ∈ desc(C) ⇒
D1.j.v(x,B) ← ⊥

7. send #commit(i,j) A,d (A ∈ act - {U}; i,j ∈ Nodes; d ⊆ act)
PRECONDITIONS:
a. A ∈ #committed_D[i]
b. d = #aborted_D[i]
TRANSITIONS:
a. D1.buf ← D.buf ∪ {#commit(i,j) A,d}

8. receive #commit(i,j) A,d (A ∈ act - {U}; i,j ∈ Nodes; d ⊆ act)
PRECONDITIONS:
a. #commit(i,j) A,d ∈ D.buf
TRANSITIONS:
a. #vertices_D1[j] ← #vertices_D[j] ∪ {A}
b. #status_D1[j](A) ← 'committed'
c. ∀x ∈ obj(j), D.j.v(x,A) ≠ ⊥ ⇒
D1.j.v(x,A) ← ⊥
D1.j.v(x,parent(A)) ← D.j.v(x,A)
d. #aborted_D1[j] ← #aborted_D[j] ∪ d
e. ∀x ∈ obj(j), C ∈ d, B ∈ desc(C) ⇒
D1.j.v(x,B) ← ⊥

9. send #abort(i,j) A (A ∈ act - {U}; i,j ∈ Nodes)
PRECONDITION:
a. A ∈ #aborted_D[i]
TRANSITIONS:
a. D1.buf ← D.buf ∪ {#abort(i,j) A}

10. receive #abort(i,j) A (A ∈ act - {U}; i,j ∈ Nodes)
PRECONDITION:
a. #abort(i,j) A ∈ D.buf
TRANSITIONS:
a. #vertices_D1[j] ← #vertices_D[j] ∪ {A}
b. #status_D1[j](A) ← 'aborted'
c. ∀x ∈ obj(j), B ∈ desc(A) ⇒
D1.j.v(x,B) ← ⊥
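
To make the send and receive transitions for create messages concrete, here is a minimal sketch. It is a simplification, not the algebra itself: `Node`, `send_create`, `receive_create` and the `desc` callback (mapping an action to its set of descendants) are hypothetical names, the node's UAS is reduced to a status table, and ⊥ is modelled by a missing key in the value map.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    status: dict = field(default_factory=dict)   # action -> 'active' | 'committed' | 'aborted'
    values: dict = field(default_factory=dict)   # (object, action) -> value; missing key means ⊥

    def aborted(self):
        return frozenset(a for a, s in self.status.items() if s == "aborted")


def send_create(nodes, buf, i, j, action):
    """send #create(i,j) A,d: A must be active at node i, and d is node i's known aborts."""
    sender = nodes[i]
    assert sender.status.get(action) == "active"
    msg = ("create", i, j, action, sender.aborted())
    buf.add(msg)                                  # the buffer only grows
    return msg


def receive_create(nodes, buf, msg, desc):
    """receive #create(i,j) A,d at node j; the message stays in the buffer."""
    assert msg in buf
    _tag, _i, j, action, d = msg
    node = nodes[j]
    node.status.setdefault(action, "active")      # add A; mark active only if it was unknown
    for c in d:                                   # merge d into the node's known aborts
        node.status[c] = "aborted"
    doomed = set().union(*(desc(c) for c in d))   # descendants of actions in d
    node.values = {k: v for k, v in node.values.items() if k[1] not in doomed}
```

Because the message is never removed from `buf`, the same message may be received again later; repeating `receive_create` leaves the node state unchanged.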


10.2 Specification of Mapping h_76


We define a (single-state) mapping from L7 to L6, h_76: L7 → L6. (We abbreviate "h_76" as "h"
in this chapter.)


At this point we instantiate the (previously unspecified) set of locations, loc; we define 
loc = tloc U Msgs 


We regard a message as a location because it is a container for information. The local information at this 
location is essentially the information contained in the message. As we explained above, we imagine that 
each message is a predefined “slot” for the particular combination of information that it represents. 


Originally this slot is empty; when the message is sent, the slot is filled. 

State Mapping 

h: Σ_7 → Σ_6 is defined as follows. Let D ∈ Γ. If h(D) = <L,V>, then

V = valuemap(D), where valuemap(D) is defined as {((o,a),u): D.home(o).v(o,a) = u}.


Valuemap is defined exactly as we expect: the "total" valuemap for Level 6 is constructed by combining
all local value maps. This mapping is so trivial that we can almost regard it as a simple change in
notation.

L is defined by:


1. If α ∈ tloc, then L(α) = D.home(α).l
2. If α ∈ Msgs, and α ∉ D.buf, then L(α) = T_0.
3. If α ∈ Msgs, and α ∈ D.buf, then


a. If α = #create(i,j) A,d, then L(α) = T, where
vertices_T = {U,A} ∪ d
committed_T = ∅
aborted_T = d


b. If α = #commit(i,j) A,d, then L(α) = T, where
vertices_T = {U,A} ∪ d
committed_T = {A}
aborted_T = d


c. If α = #abort(i,j) A, then L(α) = T, where
vertices_T = {U,A}
committed_T = ∅
aborted_T = {A}


If α ∈ tloc, then L(α) is just the UAS at α's home node. For locations which are messages, if the message
has not been sent then its location has "no information" (i.e. its UAS is the trivial UAS, T_0). If the
message has been sent, then the information in the UAS for its location corresponds exactly to the
information in the message, i.e. it describes what actions are known to be committed, aborted, or active as
a result of the message.
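
The state mapping can be sketched in a few lines. This is an illustration under stated assumptions: messages are encoded as tuples, a UAS is reduced to three sets, the trivial UAS is taken to contain only U, and `valuemap`, `message_uas` and `TRIVIAL_UAS` are names chosen for the example.

```python
# Assumed encoding of the trivial UAS: only the root action U is known.
TRIVIAL_UAS = {"vertices": frozenset({"U"}), "committed": frozenset(), "aborted": frozenset()}


def valuemap(node_values, home):
    """Combine the local value maps: V(o, a) is read from the node where object o lives."""
    total = {}
    for node, local in node_values.items():
        for (obj, action), value in local.items():
            if home[obj] == node:
                total[(obj, action)] = value
    return total


def message_uas(msg, buf):
    """UAS of a message 'slot': trivial until the message is in the buffer."""
    if msg not in buf:
        return TRIVIAL_UAS
    kind, _i, _j, action, *rest = msg
    d = frozenset(rest[0]) if rest else frozenset()   # abort messages carry no d
    if kind == "create":
        return {"vertices": frozenset({"U", action}) | d, "committed": frozenset(), "aborted": d}
    if kind == "commit":
        return {"vertices": frozenset({"U", action}) | d, "committed": frozenset({action}), "aborted": d}
    if kind == "abort":
        return {"vertices": frozenset({"U", action}), "committed": frozenset(), "aborted": frozenset({action})}
    raise ValueError(f"unknown message kind: {kind!r}")
```

Since the buffer only grows, a slot's UAS changes exactly once, from the trivial UAS to the UAS determined by the message contents.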
Event Mapping


h: ℰ_7 → ℰ_6 is defined as follows. Let ord6 be an arbitrary total order on ℰ_6. For each node, i, let loc(i)
be a distinguished tloc whose home is that node. (We will use this tloc to define an explicit "sender" for
messages from that node. If such a tloc does not exist, then it could be created just for this purpose.)


h: create A(d) → create A * <<{@create[β,α] A,d: β = creator(A), home(α) = home(A)}; ord6>>
commit A(d) → commit A * <<{@commit[β,α] A,d: β = A, home(α) = home(β)}; ord6>>
abort A → abort A * <<{@abort[β,α] A: β = A, home(α) = home(β)}; ord6>>
perform A,u(d) → perform A,u * <<{@commit[β,α] A,d: β = x, home(α) = home(β)}; ord6>>

h(send M) is defined as follows:


If M = #create(i,j) A,d, then h: send M → @create[loc(i),M] A,d
If M = #commit(i,j) A,d, then h: send M → @commit[loc(i),M] A,d
If M = #abort(i,j) A, then h: send M → @abort[loc(i),M] A


h(receive M) is defined as follows:


If M = #create(i,j) A,d, then h: receive M → <<{@create[M,α] A,d: home(α) = j}; ord6>>
If M = #commit(i,j) A,d, then h: receive M → <<{@commit[M,α] A,d: home(α) = j}; ord6>>
If M = #abort(i,j) A, then h: receive M → <<{@abort[M,α] A: home(α) = j}; ord6>>


We map local events to the corresponding local event at Level 6, followed by a sequence of
communications events that "inform" all other locations based at the same node of the event. (Note that
we use the qualifier "(d)" on local events at Level 7.) We map a send event to a communications event at
Level 6 with the message slot as the destination. (The "sender" at Level 6 is an arbitrarily selected tloc at
the sending node.) We map a receive event to a sequence of communications events at Level 6 with the
message slot as the sender, and all tlocs at the receiving node as receivers. In general we map a single
per-node event which affects the node's state to a sequence of per-location events, one for each tloc
whose home is that node.
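
The send and receive parts of this event mapping can be sketched as follows. The tuple encodings and the names `h_send`, `h_receive`, `loc` and `tlocs_at` are assumptions made for illustration; `order_key` plays the role of the arbitrary total order ord6.

```python
def h_send(msg, loc):
    """Map `send M` to one Level 6 event whose destination is the message slot M.

    `loc` maps each node to its distinguished tloc (the nominal sender).
    """
    kind, i, _j, action, *rest = msg
    sender = loc[i]
    if kind == "abort":
        return [("@abort", sender, msg, action)]
    return [("@" + kind, sender, msg, action, rest[0])]


def h_receive(msg, tlocs_at, order_key=None):
    """Map `receive M` to one Level 6 event per tloc at the receiving node.

    `tlocs_at` maps each node to the set of tlocs whose home is that node.
    The message slot itself acts as the sender of every event.
    """
    kind, _i, j, action, *rest = msg
    targets = sorted(tlocs_at[j], key=order_key)
    if kind == "abort":
        return [("@abort", msg, alpha, action) for alpha in targets]
    return [("@" + kind, msg, alpha, action, rest[0]) for alpha in targets]
```

A receive at a node with k tree locations therefore expands into k instantaneous Level 6 communications events, which is how the single per-node state change is "exploded" into per-location changes.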

10.3 Proof of Possibilities Map for h_76

We now show that h is a possibilities map.

Lemma 10.3.1: h preserves initial states.

Proof: Let h(D_0) = <L,V>; then

V = valuemap(D_0) = V_0, and

L(α) = T_0 if α ∈ Msgs, since D_0.buf = ∅.

If α ∈ tloc, then L(α) = D_0.home(α).l = T_0.
Thus L = L_0. ∎

Lemma 10.3.2: h preserves transitions.


Proof: We must show that if D ∈ PRE_7(e) ∩ ℛ_7, and h(D) ∈ PRE_6(h(e)) ∩ ℛ_6, then h(De)
= h(D)h(e).


Let De = D1, h(D) = <L,V>, h(D1) = <L1,V1>, and <L,V>h(e) = <L',V'>; then we must
show that L1 = L', and V1 = V'.


We argue the cases e = create A (d), e = send M, and e = receive M for M = #create(i,j)
A,d. Other cases are similar.


1. e = create A (d). Let β = creator(A), i = home(β).
h(e) = create A * <<{@create[β,α] A,d: home(α) = i}; ord6>>.


From transitions T7.1a,b, we have
D1.buf = D.buf, D1.j = D.j ∀j ≠ i,
D1.i.v = D.i.v,
#vertices_D1[i] = #vertices_D[i] ∪ {A},
#status_D1[i](A) = 'active'.


Thus V1 = V, and L1(α) = L(α) ∀α ∉ tloc(i). If α ∈ tloc(i), then
@vertices_L1[α] = @vertices_L[α] ∪ {A}, and @status_L1[α](A) = 'active'.


By inspection all events in h(e) only affect locations in tloc(i); thus L'(α) = L(α)
= L1(α) ∀α ∉ tloc(i).


Define relation + on tloc(i) as follows: al++ a2 = al = £, or @create([B,a]] 
A,d precedes @create[B,a2] A,d in ord6. (++ is reflexive.) Let <L2’,V2'> = 
<L,V>u, where u is the prefix of h(e) up to and including event @create[B,a2]} 
A,d. We can show inductively that 

@aborted, ,{a2], 

V2°(a,A) = V(a,A) Var E obj(i), A € act, 

alt+a2 = A € @active, al], 

~ (alr a2) = A € @vertices, fal]. 


(We will not carry through the details of the induction here. The only subtle
point is that event @create[β,α2] A,d cannot affect V(α2) (if α2 ∈ obj(i)): If B
∈ V(α2), then by Lemma 8.3.1a, B is live in L(α2); thus B cannot be a descendant
of an action in d. Note that we can apply 8.3.1a because we know <L,V> ∈ ℛ_6,
and I4' is invariant in L6 by Lemma 9.5.2.)


By applying the inductive result to the total sequence h(e), we conclude that
V' = V1 and L'(α) = L1(α) for all α in tloc(i). Thus V1 = V', and L1 = L'.

2. e = send M, M = #create(i,j) A,d.
h(e) = @create[loc(i),M] A,d.


D1.buf = D.buf ∪ {M}; D1.i = D.i ∀i ∈ Nodes.


Since valuemap(D) does not depend on D.buf, V' = V. But M ∉ obj, so h(e)
cannot affect V (in L6), ⇒ V1 = V. Thus V1 = V'.


Obviously L1(α) = L(α) unless α = M. But D1.i = D.i for all i ∈ Nodes, and if
M' ≠ M, then M' ∈ D.buf ⇔ M' ∈ D1.buf. Thus L'(α) = L(α) for all α ≠ M.


For α = M, L'(α) = T', where
vertices_T' = {U,A} ∪ d
committed_T' = ∅
aborted_T' = d


Let L1(α) = T1, L(α) = T; then
vertices_T1 = vertices_T ∪ {A} ∪ d
committed_T1 = committed_T
aborted_T1 = aborted_T ∪ d


But if M ∉ D.buf, then T = T_0 ⇒ T1 = T'. If M ∈ D.buf, then T = T' ⇒ T1
= T'.


Thus L1 = L'.


3. e = receive M, M = #create(i,j) A,d. 
h(e) = <<{@create{M,a] A,d: home(a) = j}; ord6>>. 


At node j, A is added to #vertices_D[j] (and made active if not already there), and
d is merged into #aborted_D[j]; descendants of d are discarded from D.j.v.


We show L1 = L' (the argument that V1 = V' is similar).

Let a ∈ loc. If a ∈ Msgs, then clearly L'(a) = L1(a) = L(a). If home(a) ≠ j,
then again L'(a) = L1(a) = L(a). Otherwise let L(a) = T, L1(a) = T1, L'(a)
= T'. Then L(a) = D.j.l; L'(a) = D1.j.l.

Thus vertices_T' = vertices_T ∪ {A}, aborted_T' = aborted_T ∪ d (from transitions
T7.a,c).


But T1 differs from T by the effects of the message @create[B,a] A,d, in h(e),
which has identical effect (from transitions T6.a,c), ⇒ T1 = T'. Thus L1 = L'.
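

The local effect of receiving a create message, as described in case 3, can be illustrated with a small sketch. This is not the thesis's formal model: the tuple encoding of actions, the `NodeState` class, and the function names are all assumptions introduced here for illustration, and "descendants of d are discarded" is read as dropping every version whose holder descends from an action in d.

```python
from dataclasses import dataclass, field

# Hypothetical encoding: an action is the tuple of child indices on the path
# from the root action U, so U == () and (1, 2) is child 2 of child 1 of U.

def is_descendant(a, b):
    """True if action a equals b or lies below b in the action tree."""
    return a[:len(b)] == b

@dataclass
class NodeState:
    vertices: set = field(default_factory=lambda: {()})  # actions known at this node
    active: set = field(default_factory=set)              # actions known to be active
    aborted: set = field(default_factory=set)             # aborts known at this node
    versions: dict = field(default_factory=dict)          # object -> [(holder, value), ...]

def receive_create(node, action, d):
    """Apply a create message for `action` that piggybacks the abort set `d`."""
    if action not in node.vertices:
        # A is added to the local vertices and made active if not already there.
        node.vertices.add(action)
        node.active.add(action)
    # d is merged into the local aborted set.
    node.aborted |= d
    # Versions held by descendants of any action in d are discarded.
    for obj, stack in node.versions.items():
        node.versions[obj] = [
            (holder, value) for holder, value in stack
            if not any(is_descendant(holder, b) for b in d)
        ]

node = NodeState(versions={"x": [((), 0), ((1,), 5), ((1, 2), 7)]})
receive_create(node, (3,), {(1,)})
print(node.versions["x"])  # [((), 0)]
```

In this sketch the versions written under the aborted action (1,) disappear as soon as the abort information arrives, which is the local step that keeps a later access from reading an orphan's data.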


Lemma 10.3.3: h preserves preconditions. 


Proof: We must show that if D ∈ PRE(e) is a valid state of L7, and h(D) is a valid state of L6, then
h(D) ∈ PRE6(h(e)). Let h(D) = <L,V>.


We argue the cases e = perform A,u (d), e = send M, and e = receive M for M =
#create(i,j) A,d. Other cases are similar.


1. e = perform A,u(d). Let x = object(A), i = home(x).
h(e) = perform A,u * <<{@commit[x,a] A,d: home(a) = i}; ord6>>.


First we show that <L,V> ∈ PRE6(perform A,u). L(x) = D.i.l, V(x,a) =
D.i.v(x,a), by definition of h.


a. A ∈ #active_D[i], by P7.4a,
⇒ A ∈ @active_L[x].


b. A ∈ prop-desc(D.i.v(x).holder), by P7.4b,
⇒ A ∈ prop-desc(V(x).holder).


c. u = D.i.v(x).value, by P7.4c,
⇒ u = V(x).value.


d. anc(A) ∩ #aborted_D[i] = ∅, by P7.4d,
⇒ anc(A) ∩ @aborted_L[x] = ∅.


Now let e' be the prefix of h(e) preceding @commit[x,a] A,d (for some a whose
home is i), and let <L1,V1> = <L,V>e' (in L6). We show that <L1,V1> ∈
PRE6(@commit[x,a] A,d):


a. We must show that A ∈ @committed_L1[x]. But event perform A,u
must be in e', ⇒ A ∈ @committed_L1[x].


b. We must show d = @aborted_L1[x]. But d = #aborted_D[i] by
P7.4e, ⇒ d = @aborted_L[x].
But none of the events in e' can change @aborted_L[x] (perform A,u
obviously does not change @aborted_L[x], and if event @commit[x,a]
A,d occurs in e', then d ⊆ @aborted_L[x] already). Thus d =
@aborted_L1[x].
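

The four conditions (a) through (d) checked for perform A,u can be read as a single local test. The following is a minimal sketch under assumptions of our own: actions are encoded as tuples (paths from the root U = ()), and `can_perform` and its parameter names are hypothetical, not taken from the thesis.

```python
def ancestors(action):
    """All ancestors of an action, including the action itself and the root ()."""
    return {action[:k] for k in range(len(action) + 1)}

def is_proper_descendant(a, b):
    """True if a lies strictly below b in the action tree."""
    return len(a) > len(b) and a[:len(b)] == b

def can_perform(action, value, active, aborted, holder, current_value):
    """Local precondition for `perform action,value` on one object.

    active, aborted : sets of actions known at the object's home node
    holder          : action holding the object's current version
    current_value   : value of that current version
    """
    return (
        action in active                                  # (a) A is active
        and is_proper_descendant(action, holder)          # (b) A is below the holder
        and value == current_value                        # (c) A sees the current value
        and ancestors(action).isdisjoint(aborted)         # (d) no ancestor of A is known aborted
    )

# Access (1, 2) may run while nothing above it is known to be aborted ...
print(can_perform((1, 2), 0, {(1, 2)}, set(), (), 0))     # True
# ... but is blocked once the abort of its parent (1,) is known locally.
print(can_perform((1, 2), 0, {(1, 2)}, {(1,)}, (), 0))    # False
```

Condition (d) is the orphan detection test proper; the other three are the usual locking and visibility checks.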


2. e = receive M, M = #create(i,j) A,d.
h(e) = <<{@create[M,a] A,d: home(a) = j}; ord6>>.


None of the events in h(e) affect L(M), and the precondition for each event
@create[M,a] A,d depends only on L(M). Thus it suffices to show that <L,V> ∈
PRE6(@create[M,a] A,d) for all a.


But M ∈ D.buf by P7.6a, so L(M) = T, where
vertices_T = {U,A} ∪ d,
committed_T = ∅,
aborted_T = d.


Thus
a. A ∈ @active_L[M]


b. d ⊆ @aborted_L[M]


3. e = send M, M = #create(i,j) A,d.
h(e) = @create[loc(i),M] A,d.


home(loc(i)) = i, by definition, so L(loc(i)) = D.i.l.


a. A ∈ #active_D[i], by P7.5a,
⇒ A ∈ @active_L[loc(i)].


b. d = #aborted_D[i], by P7.5b,
⇒ d = @aborted_L[loc(i)].


Theorem 10.3.4: h is a possibilities map. 


Proof: By Lemma 10.3.1, h preserves initial states. By Lemmas 10.3.2 and 10.3.3, and Lemma
4.2.2.6, h preserves events. Thus h is a possibilities map. ∎


10.4 Mapping from Level 7 to Level 0 


We can now prove the main theorem of this thesis: valid execution sequences of our
lowest-level model (Level 7), when suitably interpreted, generate only view-serializable action trees.


Main Theorem: Let g: L7 → L0 be defined by


g = h10 ∘ h21 ∘ h32 ∘ h43 ∘ h54 ∘ h65 ∘ h76


Let v be some valid execution sequence in L7. Then T_{g(v)} is a view-serializable action
tree.


Proof: We have shown that each of these maps is a possibilities map (Theorems 6.4.4.1, 6.5.2.1, 7.4.5,
8.4.5, 9.4.5, 9.5.1, and 10.3.4). By Lemma 4.2.2.5, each is a valid interpretation. By
repeated application of Lemma 4.1.3.2, g is a valid interpretation from L7 to L0. Thus v valid in L7
⇒ g(v) valid in L0. By Lemma 6.1.1, T_{g(v)} is view-serializable.
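

The way the Main Theorem chains interpretations between adjacent levels can be pictured with a toy composition. The sketch below is only an illustration: the two per-event maps `h_low` and `h_high` and the event names are invented stand-ins, not the thesis's maps, and an interpretation is modeled simply as a function from one event to a (possibly empty) list of higher-level events.

```python
from functools import reduce

def lift(event_map):
    """Extend a per-event map to whole event sequences by concatenation."""
    def on_sequence(seq):
        out = []
        for event in seq:
            out.extend(event_map(event))
        return out
    return on_sequence

def compose(*maps):
    """compose(f, g, h)(x) == f(g(h(x))): the rightmost map is applied first."""
    return reduce(lambda f, g: lambda x: f(g(x)), maps)

# Hypothetical lower map: a "receive" expands into one create event per local location.
def h_low(event):
    if event == "receive":
        return [("create", loc) for loc in ("a1", "a2")]
    return [event]

# Hypothetical higher map: bookkeeping create events have no higher-level counterpart.
def h_high(event):
    if isinstance(event, tuple) and event[0] == "create":
        return []
    return [event]

g = compose(lift(h_high), lift(h_low))
print(lift(h_low)(["perform", "receive"]))  # ['perform', ('create', 'a1'), ('create', 'a2')]
print(g(["perform", "receive"]))            # ['perform']
```

Because each stage maps sequences to sequences, a property shown for every stage separately (here, in the thesis, being a valid interpretation) can be carried across the whole chain one stage at a time.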


11. Conclusions


11.1 Summary and Evaluation 


We have presented a detailed proof that a particular transaction system model satisfies our 
definition of internal consistency. The proof was structured on several levels, corresponding to different 
levels of abstraction of the transaction system. While the lowest-level model is still quite “abstract” in 
that it is far removed from an actual implementation, we feel that it captures many of the basic design
decisions made for the Argus transaction system.


We believe our work has made two contributions: First, we have formalized internal 
consistency and we have related this formal condition to a particular orphan detection strategy. Second,
we have explored a method for multi-level correctness proofs which might be useful in other contexts.


11.1.1 Orphan Detection and Internal Consistency 


Our definition of view-serializability appears to be a useful condition for internal consistency. 
In the development of the Argus orphan algorithm, designers have often relied on particular scenarios 
where inconsistencies arose to justify the need for including certain information in messages (or writing
certain information to stable storage). While this type of reasoning can demonstrate shortcomings in the
algorithm, it cannot prove the algorithm correct (we cannot “prove by example”). Perhaps the results of
this thesis, and future extensions of these results, can partly subsume this “reasoning by scenario.”


Although we have ignored crashes in our system models, the view-serializability condition 
appears to be applicable in an environment with crashes. We have applied this condition to scenarios of 
inconsistencies in Argus which result from crashes; these inconsistencies can be explained by showing 
that an action does not have a serializable view tree. (Since view-serializability is a sufficient condition for
internal consistency, any inconsistency should be explicable by the absence of a serializable view tree.)


11.1.2 Algebraic Models 


The multi-level structure of our correctness proof yields at least two benefits: First, since 
adjacent levels are generally closely related, the possibilities maps (and proofs of possibilities maps) 
between adjacent levels are relatively simple. Although we employ many levels, overall complexity is
reduced and understandability of the mappings is enhanced.


Second, because the higher-level models are more abstract, they might prove to be useful 
abstractions of different implementations. At Level 1 we describe the ANC-ABORT property, at Level 2
we describe a specific orphan detection precondition, and only at Level 5 do we explain how this
detection is carried out locally by piggybacking abort lists onto messages. A different orphan strategy
could be described at lower levels, but the higher-level models might still apply. As a trivial example, if
all orphans are always exterminated immediately, then it is easy to show that condition ANC-ABORT
from Level 1 is satisfied. Thus the correctness proof from Level 1 could be carried over to a system using
immediate extermination. As another example, if we change the specific information piggybacked onto
create and commit messages at Level 5 (for example, we might choose to send only a covering subset of
the known aborts set), then the Level 4 model might still apply.
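

One way to read "a covering subset of the known aborts set" is sketched below. The reading is an assumption on our part: the covering subset is taken to be the aborted actions that have no aborted proper ancestor, since every other aborted action is already a descendant of one of these. Actions are again encoded as tuples (paths from the root U = ()), and the function names are hypothetical.

```python
def covering_subset(aborted):
    """Aborted actions with no aborted proper ancestor.

    Every action in `aborted` is a descendant of some action in the result,
    so the result identifies the same orphans as the full set.
    """
    return {
        a for a in aborted
        if not any(a[:k] in aborted for k in range(len(a)))  # proper prefixes only
    }

def is_orphan(action, aborts):
    """True if the action or one of its ancestors appears in `aborts`."""
    return any(action[:k] in aborts for k in range(len(action) + 1))

known = {(1,), (1, 2), (1, 2, 3), (4, 5)}
cover = covering_subset(known)
print(sorted(cover))                                           # [(1,), (4, 5)]
print(is_orphan((1, 2, 9), known), is_orphan((1, 2, 9), cover))  # True True
print(is_orphan((4,), known), is_orphan((4,), cover))            # False False
```

Under this reading, a message carrying only the cover lets the receiver detect the same orphans as the full list, which is the kind of change the Level 4 model might absorb without modification.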


Our notion of “homomorphism” is unusual in that we allow “possibilities” mappings to sets of
states at higher levels. This approach allows us to explain the “auxiliary state” technique as a particular
kind of possibilities map. For our algebra hierarchy, we used a multiple-state augmentation mapping
between Levels 6 and 5. We speculate that the use of possibilities maps instead of auxiliary state variables
might simplify some correctness proofs.


11.2 Directions for Further Research 


The application of formal techniques to distributed transaction systems is a vast topic; we limit
our discussion to three possible extensions of our work.


11.2.1 Crashes 


The most glaring deficiency of our model is that we do not consider node crashes. Node crashes 
are a more difficult problem than explicit aborts because the orphans created by a node crash might be 
ancestors (or relatives) of actions which ran at the crashed node and committed. The “infected” ancestor 
can commit arbitrarily far up the action tree before the crash is discovered (though it will eventually be 
caught at the top level during two-phase commit if it is not caught sooner). 


The (visible-data-closed) view tree which we used to prove view-serializability for the explicit
aborts case will not work for a crash model. It is possible that a datastep can be “visible” to another access
since it has committed to their least common ancestor, but the effect of this datastep might have been
undone by a crash. Consider the tree of Fig. 11.1, for example. Object x has initial value 0. Action A
spawns concurrent children A1 and A2. Action A1 runs, increments x, and commits to A. Then x’s node
crashes, allowing A2 to get a lock on x. Action A2 cannot see the effects of A1, because x’s node crashes
after A1 commits to A. A2’s view is consistent, because there exists a serializable view tree for A2, but
this view tree does not include A1. (A2 is an orphan, because A is an orphan, but A2 is not yet a “bad”
orphan.) Note also that if A2 commits to A, then A’s view becomes inconsistent. Thus an orphan
detection strategy for a crash model must place restrictions on the commit of actions; for the explicit
aborts case, we have shown that it is sufficient to put a precondition on perform events.
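

The scenario of Fig. 11.1 can be replayed with a small simulation. This is a sketch under simplifying assumptions of our own, not a model from the thesis: the `LockedObject` class, its version stack, and the rule that a crash discards every version not committed to the top level are all hypothetical, and actions are encoded as tuples (paths from the root U = ()).

```python
class LockedObject:
    """Toy atomic object whose version stack is lost when its node crashes."""

    def __init__(self, initial):
        self.committed = initial       # value committed to the top level; survives a crash
        self.stack = [((), initial)]   # volatile versions as (holder, value) pairs

    def read(self):
        return self.stack[-1][1]

    def write(self, action, value):
        self.stack.append((action, value))

    def commit_to_parent(self, action):
        holder, value = self.stack.pop()
        assert holder == action, "only the current holder can commit its version"
        parent = action[:-1]
        if self.stack[-1][0] == parent:
            self.stack[-1] = (parent, value)    # parent already held a version
        else:
            self.stack.append((parent, value))  # parent inherits the version
        if parent == ():
            self.committed = value

    def crash(self):
        # Volatile versions are lost; only the top-level committed value remains.
        self.stack = [((), self.committed)]

def run_scenario():
    A1, A2 = (1, 1), (1, 2)          # concurrent children of A = (1,)
    x = LockedObject(0)
    seen_by_a1 = x.read()
    x.write(A1, seen_by_a1 + 1)      # A1 increments x
    x.commit_to_parent(A1)           # A1 commits to A
    held_by_a = x.read()
    x.crash()                        # x's node crashes after A1 commits
    seen_by_a2 = x.read()            # A2 now reads x
    return seen_by_a1, held_by_a, seen_by_a2

print(run_scenario())  # (0, 1, 0)
```

A2 sees 0 even though A1 has committed to their common parent A, so A2's own view is consistent only if A1 is left out of it, while A's view would become inconsistent if A2 were later allowed to commit to A.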


We speculate that a high-level notion of “depending on a crash” could be developed to parallel 
our notion of depending on an abort, and that a sufficient condition for view-serializability could be 
expressed in terms of these dependencies. Piggybacking of crash count information would appear at 
lower levels. A better approach would be to somehow unify aborts and crashes (i.e., treat them both as
particular cases of a higher-level event), but we have made little progress in this direction.


Fig. 11.1. Consistent View of Orphan Arising from a Node Crash


[Figure: action tree with root U above A; A has children A1,c and A2; A1 and A2
each access x and see value 0. Annotation: x's node crashes after A1 commits.]


11.2.2 Lower-Level Models


Although our lowest-level model is “distributed,” it ignores many of the optimizations and 
complications of a real orphan detection algorithm. A more satisfying “correctness proof" would extend 
our bottom level to even lower-level models which are closer to a real design. At least two areas for 
refinement may be explored: First, since the system history of aborts will grow without bound, any 
operational orphan algorithm will not send DONE in entirety on each message. Reducing this overhead 
will require some connection information or garbage collection scheme (perhaps using some variant of 
orphan expiration [Nelson81]). It would be useful to prove that these modifications are indeed
optimizations in that they do not violate internal consistency.


Second, our model describes the possible flows of information, but it does not describe strategies
for actually sending messages. (For example, do actions inform descendants immediately when they
commit, or do they answer to queries from descendants?) Since our work focuses on correctness of
reachable states, we have been able to ignore these questions. Of equal interest to designers, though, are
properties of liveness (for example, will a commit message ever arrive) and bounds on delays.
Formalization of these properties might require fundamentally different mechanisms.


As lower-level models become more detailed, they will approach specifications for the programs 
of a transaction system. At this point the boundary blurs between these correctness proofs and program
verification.


11.2.3 User-Defined Atomic Data Types 


We have limited the objects in our model to simple atomic objects implemented using mutual 
exclusion locks and a stack of versions. For some applications these objects might be inefficient: different 
implementations of atomic objects might provide additional concurrency or a more efficient backup and 
recovery mechanism. As explained in [Weihl82], the “atomicity” of a data type depends on the semantics
of the operations available to users of that type. As a trivial example, if a type is “immutable” (none of
the operations change the abstract object), then it is automatically atomic. Our serializability condition is
insufficient to describe this more general notion of atomicity.


More general “user-defined” atomic types can be constructed from basic atomic objects (like
those in our model) and completely nonatomic objects (which provide no synchronization or recovery).
(Again, see [Weihl82] for examples of these constructions.) Because the effects of aborted actions might
not be undone, undetected orphans can violate external consistency through non-atomic data (with 
catastrophic effects). Thus an orphan detection strategy is more important for systems which allow 
non-atomic objects. Although orphan detection does not guarantee view-serializability for systems with 
non-atomic objects, it might guarantee weaker properties which are useful to programmers trying to use 
non-atomic objects to construct atomic types. We have begun to explore these properties (and more 
complex models which incorporate general atomic types). For example, it is relatively easy to show that 
the orphan detection strategy we have modeled constrains the order of datasteps on a non-atomic object 
to be consistent with the sequence ordering. (Without orphan detection, even this weak condition might 
not hold.)


Appendix I - Notational Conventions


Fig. I.1. Conventions for Figures


The action tree, T, is usually implicit.


A,c                        --- A ∈ committed_T
A,a                        --- A ∈ aborted_T
A above B, joined by one of four edge styles, in order:
                           --- A = parent(B)
                           --- A ∈ prop-anc(B)
                           --- A ∈ anc(B)
                           --- prop-anc(B) ∩ prop-desc(A) ⊆ committed_T
A and B joined by one of two arrow styles, in order:
                           --- (A,B) ∈ seq_T
                           --- (A,B) ∈ data_T


Fig. I.2. Cross-Reference of Invariants to Lemmas


Invariant Symbol    Lemma(s)
Ia                  6.3.1.1.2, 6.3.1.1.4, 6.3.2.2, 6.3.3.1, and 6.3.3.2
Ja                  6.3.1.1.1, 6.3.1.1.3, 6.3.3.3, and 6.3.3.4
Sa                  6.3.1.2.1 through 6.3.1.2.14
I3                  7.3.2
J3                  7.3.1
I4                  8.3.1
I4'                 8.3.1 except for 8.3.1e
I5                  9.3.3


[Best81] 


[EGLT76] 


[Liskov82] 


[Lynch82] 


[Moss81] 


[Nelson81]


[Papa79] 


[Reed78]